58 commits
0f04b78
Add files via upload
madhuri-pc30 Oct 9, 2025
1bba8f0
Studio Part 1
madhuri-pc30 Oct 10, 2025
37dc142
Demo File
madhuri-pc30 Oct 10, 2025
df6bdc0
Merge pull request #1 from madhuri-pc30/assignement
madhuri-pc30 Oct 10, 2025
98b5697
adding comment
madhuri-pc30 Oct 10, 2025
984b08b
Merge pull request #2 from madhuri-pc30/assignement
madhuri-pc30 Oct 10, 2025
11fe103
gi
madhuri-pc30 Oct 14, 2025
f78c164
Merge pull request #3 from madhuri-pc30/assignment
madhuri-pc30 Oct 14, 2025
79aef53
Demo practice file
madhuri-pc30 Oct 15, 2025
85d923d
Merge pull request #4 from madhuri-pc30/assignement
madhuri-pc30 Oct 15, 2025
f15351f
Sql-part3-exercise
madhuri-pc30 Oct 16, 2025
71ca926
Merge pull request #5 from madhuri-pc30/assignement
madhuri-pc30 Oct 16, 2025
771d180
SQL-Part4
madhuri-pc30 Oct 21, 2025
3338a7e
Merge pull request #6 from madhuri-pc30/assignement
madhuri-pc30 Oct 21, 2025
be1d3e7
SQL-part4-Studio
madhuri-pc30 Oct 22, 2025
1f340a6
Merge pull request #7 from madhuri-pc30/assignement
madhuri-pc30 Oct 22, 2025
4d2bbdb
corrected 2nd question
madhuri-pc30 Oct 22, 2025
1be58de
Merge pull request #8 from madhuri-pc30/assignement
madhuri-pc30 Oct 22, 2025
f38fd45
SQL-part -5 studio
madhuri-pc30 Oct 24, 2025
10a5ed9
Merge pull request #9 from madhuri-pc30/assignement
madhuri-pc30 Oct 24, 2025
d91648f
SQL-5 -exercise
madhuri-pc30 Oct 27, 2025
dabc277
Merge pull request #10 from madhuri-pc30/assignement
madhuri-pc30 Oct 27, 2025
b379968
Add files via upload
madhuri-pc30 Nov 3, 2025
6c1825c
Merge pull request #11 from madhuri-pc30/assignement
madhuri-pc30 Nov 3, 2025
83ec0a8
saving python
madhuri-pc30 Nov 3, 2025
7148b89
Merge pull request #12 from madhuri-pc30/assignement
madhuri-pc30 Nov 3, 2025
6db10cf
booleans_conditionals_exercise
madhuri-pc30 Nov 5, 2025
bb042e5
Merge pull request #13 from madhuri-pc30/assignement
madhuri-pc30 Nov 5, 2025
7d7461c
For_loop
madhuri-pc30 Nov 6, 2025
ce70839
Merge pull request #14 from madhuri-pc30/assignement
madhuri-pc30 Nov 6, 2025
4c91300
studio_assignment
madhuri-pc30 Nov 10, 2025
2edc66d
Merge pull request #15 from madhuri-pc30/assignement
madhuri-pc30 Nov 10, 2025
f86055a
functions-studio
madhuri-pc30 Nov 14, 2025
4dcdccc
Merge pull request #16 from madhuri-pc30/assignement
madhuri-pc30 Nov 14, 2025
7364279
studio-assignement
madhuri-pc30 Nov 20, 2025
b573028
Merge pull request #17 from madhuri-pc30/assignement
madhuri-pc30 Nov 20, 2025
4ef4f17
new changes
madhuri-pc30 Nov 21, 2025
4fb010e
Merge pull request #18 from madhuri-pc30/assignement
madhuri-pc30 Nov 21, 2025
48abc7c
exercise
madhuri-pc30 Nov 21, 2025
bd4a820
Merge pull request #19 from madhuri-pc30/assignement
madhuri-pc30 Nov 21, 2025
4d10e69
cleaning_data_assignment
madhuri-pc30 Dec 2, 2025
6442983
Merge pull request #20 from madhuri-pc30/assignement
madhuri-pc30 Dec 2, 2025
71a77aa
Studio_assignment
madhuri-pc30 Dec 3, 2025
d106d5d
Merge pull request #21 from madhuri-pc30/assignement
madhuri-pc30 Dec 3, 2025
09a21c4
Data_Manipulation_assignment
madhuri-pc30 Dec 3, 2025
35d4b50
Merge pull request #22 from madhuri-pc30/assignement
madhuri-pc30 Dec 3, 2025
429aad6
Studio_assignment
madhuri-pc30 Dec 3, 2025
2abcd6d
Merge pull request #23 from madhuri-pc30/assignement
madhuri-pc30 Dec 3, 2025
c879634
project 3
madhuri-pc30 Dec 8, 2025
a3b241e
Merge pull request #24 from madhuri-pc30/assignement
madhuri-pc30 Dec 8, 2025
fbe2ee8
project 3
madhuri-pc30 Dec 8, 2025
96e06f7
test change
madhuri-pc30 Dec 8, 2025
d0828f5
project 3
madhuri-pc30 Dec 8, 2025
64ba9b6
Merge branch 'main' of https://github.com/madhuri-pc30/data-analysis-…
madhuri-pc30 Dec 8, 2025
1c54123
Merge branch 'main' of https://github.com/madhuri-pc30/data-analysis-…
madhuri-pc30 Dec 8, 2025
c02f96c
Merge branch 'main' into assignement
madhuri-pc30 Dec 8, 2025
5a1ec8d
Final_project_Healthcare
madhuri-pc30 Dec 12, 2025
2967877
Merge branch 'assignement' of https://github.com/madhuri-pc30/data-an…
madhuri-pc30 Dec 12, 2025
337 changes: 337 additions & 0 deletions 9_to_5_sql_project.sql
@@ -0,0 +1,337 @@
--1. Use this space to make note of each table in the database, the columns within each table, each column’s data type, and how the tables are connected. You can write this down or draw a diagram. Whatever method helps you get an
--understanding of what is going on with `LaborStatisticsDB`.
--Answer-
--explore the LaborStatisticsDB database

--Step 1: List all tables in the database

SELECT TABLE_NAME
FROM LaborStatisticsDB.INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE';
--9 rows affected

--Step 2: List columns and data types for each table
SELECT COLUMN_NAME, DATA_TYPE
FROM LaborStatisticsDB.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'annual_2016';

SELECT COLUMN_NAME, DATA_TYPE
FROM LaborStatisticsDB.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'datatype';

SELECT COLUMN_NAME, DATA_TYPE
FROM LaborStatisticsDB.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'footnote';

SELECT COLUMN_NAME, DATA_TYPE
FROM LaborStatisticsDB.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'industry';

SELECT COLUMN_NAME, DATA_TYPE
FROM LaborStatisticsDB.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'january_2017';

SELECT COLUMN_NAME, DATA_TYPE
FROM LaborStatisticsDB.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'period';

SELECT COLUMN_NAME, DATA_TYPE
FROM LaborStatisticsDB.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'seasonal';

SELECT COLUMN_NAME, DATA_TYPE
FROM LaborStatisticsDB.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'series';

SELECT COLUMN_NAME, DATA_TYPE
FROM LaborStatisticsDB.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'supersector';

--Step 3: Identify relationships between tables

--series_id appears in both the series table and the employment tables (annual_2016, january_2017), so it is likely the key that relates them.
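
--A minimal sketch for spotting other candidate join keys, assuming the standard
--INFORMATION_SCHEMA views are readable: a column name that occurs in more than one
--table (such as series_id) is a likely relationship. The alias table_count is illustrative.
SELECT COLUMN_NAME, COUNT(*) AS table_count
FROM LaborStatisticsDB.INFORMATION_SCHEMA.COLUMNS
GROUP BY COLUMN_NAME
HAVING COUNT(*) > 1
ORDER BY table_count DESC, COLUMN_NAME;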


--2. What is the datatype for women employees?

SELECT TOP 10 *
FROM LaborStatisticsDB.dbo.datatype;

--search the rows of datatype (not its column names) for the women-employees entry
SELECT data_type_code, data_type_text
FROM LaborStatisticsDB.dbo.datatype
WHERE data_type_text LIKE '%women%';

SELECT *
FROM LaborStatisticsDB.dbo.datatype;

SELECT TOP 10 series_id, data_type_code
FROM LaborStatisticsDB.dbo.series;


SELECT TOP 20 *
FROM LaborStatisticsDB.dbo.series
WHERE data_type_code = '10';

--Answer- The data_type_code for women employees is 10.

--3. What is the series id for women employees in the commercial banking industry in the
--financial activities supersector?

SELECT TOP 10
s.series_id,
s.data_type_code,
i.industry_code,
i.industry_name
FROM LaborStatisticsDB.dbo.series s
JOIN LaborStatisticsDB.dbo.industry i
ON s.industry_code = i.industry_code
WHERE s.data_type_code = '10'
AND i.industry_name LIKE '%Commercial Banking%';

--Answer- 2 rows affected

--## Aggregate Your Friends and Code some SQL
--1. How many employees were reported in 2016 in all industries? Round to the nearest whole number.

SELECT TOP 10 *
FROM LaborStatisticsDB.dbo.series;

SELECT
ROUND(SUM(value), 0) AS total_employees_2016
FROM LaborStatisticsDB.dbo.annual_2016
WHERE year = 2016;

--Answer- 2351408916

--2. How many women employees were reported in 2016 in all industries?
--Round to the nearest whole number.

SELECT
ROUND(SUM(value), 0) AS total_women_employees_2016
FROM LaborStatisticsDB.dbo.annual_2016
WHERE series_id = 'CES0000000010';

--Answer- NULL
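
--A possible alternative, offered as a sketch: the NULL above suggests that no annual_2016
--rows match that single series_id. Joining to series and filtering on data_type_code = '10'
--(the women-employees code found in question 2) sums every matching series instead.
SELECT
ROUND(SUM(a.value), 0) AS total_women_employees_2016
FROM LaborStatisticsDB.dbo.annual_2016 AS a
JOIN LaborStatisticsDB.dbo.series AS s
ON a.series_id = s.series_id
WHERE s.data_type_code = '10';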


--3. How many production/nonsupervisory employees were reported in 2016?
-- Round to the nearest whole number.

SELECT TOP 10 *
FROM LaborStatisticsDB.dbo.series
WHERE series_title = 'Production and nonsupervisory employees';

SELECT
ROUND(SUM(value), 0) AS Production_nonsupervisory_employees
FROM LaborStatisticsDB.dbo.annual_2016
WHERE series_id = 'CES0600000006';

--Answer- NULL
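
--A sketch under stated assumptions, not a confirmed answer: look up the data type code
--first instead of hard-coding one series_id. The LIKE pattern assumes the datatype text
--contains this phrase; it may also match the hours and earnings types.
SELECT data_type_code, data_type_text
FROM LaborStatisticsDB.dbo.datatype
WHERE data_type_text LIKE '%production and nonsupervisory employees%';

--Then sum annual_2016 for the chosen code. '6' is an assumption, inferred from the "06"
--suffix of the series_id used above; replace it with the code returned by the lookup.
SELECT
ROUND(SUM(a.value), 0) AS production_nonsupervisory_employees
FROM LaborStatisticsDB.dbo.annual_2016 AS a
JOIN LaborStatisticsDB.dbo.series AS s
ON a.series_id = s.series_id
WHERE s.data_type_code = '6';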


--4. In January 2017, what is the average weekly hours worked by production and nonsupervisory
--employees across all industries?

SELECT
ROUND(AVG(value), 2) AS avg_weekly_hours_jan2017
FROM LaborStatisticsDB.dbo.january_2017
WHERE series_id = 'CES0600000006'
AND period = 'M01';

--look up the period codes (January appears as 'M01' in the queries below)
SELECT *
FROM LaborStatisticsDB.dbo.[period];

SELECT
ROUND(AVG(value), 2) AS avg_weekly_hours_jan2017
FROM LaborStatisticsDB.dbo.january_2017
WHERE series_id = 'CES0600000007' -- avg weekly hours series_id
GROUP BY [period]
--answer- 41.1

--5. What is the total weekly payroll for production and nonsupervisory employees across all
-- industries in January 2017? Round to the nearest penny.

--the semicolon is required here: a CTE (WITH ...) must follow a terminated statement
SELECT TOP 10 *
FROM LaborStatisticsDB.dbo.series;


WITH employees AS (
SELECT SUM(value) AS total_employees
FROM LaborStatisticsDB.dbo.january_2017
WHERE series_id = 'CES0500000006' AND period = 'M01'
),
avg_hours AS (
SELECT AVG(value) AS avg_weekly_hours
FROM LaborStatisticsDB.dbo.january_2017
WHERE series_id = 'CES0500000007' AND period = 'M01'
),
avg_earnings AS (
SELECT AVG(value) AS avg_hourly_earnings
FROM LaborStatisticsDB.dbo.january_2017
WHERE series_id = 'CES0500000008' AND period = 'M01'
)
SELECT
ROUND(
(SELECT total_employees FROM employees) *
(SELECT avg_weekly_hours FROM avg_hours) *
(SELECT avg_hourly_earnings FROM avg_earnings),
2) AS total_weekly_payroll_jan2017;

--Answer- 148996351.39

--6. In January 2017, for which industry was the average weekly hours worked by production and nonsupervisory employees the highest?
--Which industry was the lowest?

SELECT TOP 1
i.industry_name,
j.value AS avg_weekly_hours
FROM LaborStatisticsDB.dbo.january_2017 j
JOIN LaborStatisticsDB.dbo.series s
ON j.series_id = s.series_id
JOIN LaborStatisticsDB.dbo.industry i
ON s.industry_code = i.industry_code
WHERE j.series_id = 'CES0500000007'
AND j.period = 'M01'
ORDER BY j.value DESC;

--lowest
SELECT TOP 1
i.industry_name,
j.value AS avg_weekly_hours
FROM LaborStatisticsDB.dbo.january_2017 j
JOIN LaborStatisticsDB.dbo.series s
ON j.series_id = s.series_id
JOIN LaborStatisticsDB.dbo.industry i
ON s.industry_code = i.industry_code
WHERE j.series_id = 'CES0500000007'
AND j.period = 'M01'
ORDER BY j.value ASC;

--Answer- Total private = 33.6
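
--A sketch of one way to compare industries: filtering on a single series_id selects one
--series, so the highest and lowest queries above return the same row. Filtering on the
--data type instead ranks every industry. The code '7' is an assumption, inferred from the
--"07" suffix of the series_id used above; confirm it against the datatype table.
SELECT TOP 1
i.industry_name,
j.value AS avg_weekly_hours
FROM LaborStatisticsDB.dbo.january_2017 j
JOIN LaborStatisticsDB.dbo.series s
ON j.series_id = s.series_id
JOIN LaborStatisticsDB.dbo.industry i
ON s.industry_code = i.industry_code
WHERE s.data_type_code = '7'
AND j.period = 'M01'
ORDER BY j.value DESC; --switch to ASC for the lowest industry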

--## Join in on the Fun
--Time to start joining! You can choose the type of join you use, just make sure to make a note!

--1. Join `annual_2016` with `series` on `series_id`. We only want the data in the `annual_2016`
--table to be included in the result.

SELECT
a.*,
s.series_title,
s.industry_code,
s.data_type_code
FROM LaborStatisticsDB.dbo.annual_2016 AS a
LEFT JOIN LaborStatisticsDB.dbo.series AS s
ON a.series_id = s.series_id;

--Answer- 29042 rows affected

SELECT TOP 50
a.*,
s.series_title,
s.industry_code,
s.data_type_code
FROM LaborStatisticsDB.dbo.annual_2016 AS a
LEFT JOIN LaborStatisticsDB.dbo.series AS s
ON a.series_id = s.series_id
ORDER BY a.id;

--Answer- 50 rows affected

--2. Join `series` and `datatype` on `data_type_code`.
-- Limiting rows returned from query, uncomment the line below to start on your query!
-- SELECT TOP 50 *
-- Uncomment the line below when you are ready to run the query, leaving it as your last!
-- ORDER BY id

SELECT TOP 50
s.series_id,
s.series_title,
s.data_type_code,
d.data_type_text
FROM LaborStatisticsDB.dbo.series AS s
LEFT JOIN LaborStatisticsDB.dbo.datatype AS d
ON s.data_type_code = d.data_type_code
ORDER BY s.series_id;

--Answer- 50 rows affected

--3. Join `series` and `industry` on `industry_code`.

-- Limiting rows returned from query, uncomment the line below to start on your query!
-- SELECT TOP 50 *
-- Uncomment the line below when you are ready to run the query, leaving it as your last!
-- ORDER BY id

SELECT TOP 50
s.series_id,
s.series_title,
s.industry_code,
i.industry_name
FROM LaborStatisticsDB.dbo.series AS s
LEFT JOIN LaborStatisticsDB.dbo.industry AS i
ON s.industry_code = i.industry_code
ORDER BY s.series_id;

--Answer- 50 rows affected

--## Subqueries, Unions, Derived Tables, Oh My!

--1. Write a query that returns the `series_id`, `industry_code`, `industry_name`, and `value` from the `january_2017` table but only if that value is greater than the average
--value for `annual_2016` of `data_type_code` 82.

SELECT
j.series_id,
s.industry_code,
i.industry_name,
j.value
FROM LaborStatisticsDB.dbo.january_2017 AS j
LEFT JOIN LaborStatisticsDB.dbo.series AS s
ON j.series_id = s.series_id
LEFT JOIN LaborStatisticsDB.dbo.industry AS i
ON s.industry_code = i.industry_code
WHERE j.value > (
SELECT AVG(a.value)
FROM LaborStatisticsDB.dbo.annual_2016 AS a
LEFT JOIN LaborStatisticsDB.dbo.series AS s2
ON a.series_id = s2.series_id
WHERE s2.data_type_code = '82'
);

--Answer- 754 rows affected

--2. During which time period did production and nonsupervisory employees fare better?

SELECT [value], [period]
FROM LaborStatisticsDB.dbo.january_2017
WHERE series_id IN (select series_id from LaborStatisticsDB.dbo.series where series_title like '%production and nonsupervisory employees%')
AND [value] = (SELECT MAX([value]) from LaborStatisticsDB.dbo.january_2017)

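--rows holding the maximum value among production and nonsupervisory series
SELECT [value], [period]
FROM LaborStatisticsDB.dbo.january_2017
WHERE [value] = (SELECT MAX(j.[value])
FROM LaborStatisticsDB.dbo.january_2017 AS j
WHERE j.series_id IN (SELECT series_id FROM LaborStatisticsDB.dbo.series
WHERE series_title LIKE '%production and nonsupervisory employees%'));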

SELECT TOP 10 *
FROM LaborStatisticsDB.dbo.series;


SELECT TOP 10 *
FROM LaborStatisticsDB.dbo.[period];
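
--A hedged sketch for the comparison, using UNION to put both periods side by side.
--Assumption: data_type_code '30' is the average weekly earnings of production and
--nonsupervisory employees; verify it in the datatype table before relying on the output.
--The column aliases time_period and avg_value are illustrative.
SELECT '2016 annual' AS time_period, ROUND(AVG(a.value), 2) AS avg_value
FROM LaborStatisticsDB.dbo.annual_2016 AS a
JOIN LaborStatisticsDB.dbo.series AS s
ON a.series_id = s.series_id
WHERE s.data_type_code = '30'
UNION
SELECT 'January 2017' AS time_period, ROUND(AVG(j.value), 2) AS avg_value
FROM LaborStatisticsDB.dbo.january_2017 AS j
JOIN LaborStatisticsDB.dbo.series AS s
ON j.series_id = s.series_id
WHERE s.data_type_code = '30';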

--3. Now that you have explored the datasets, is there any data or information that you wish you had in this analysis?

--Answer- Yes. The data was helpful, but a few additions would have improved the analysis:

--Information by age, gender, or location: this would show whether all groups of workers benefited equally.

--Cost of living or inflation: wages alone are not enough; we would need to know whether they kept up with rising prices.

--Benefits or overtime pay: this would show the full income employees receive, not just their regular pay.

--Unemployment or job-loss data: this would show how stable jobs were during that time.



Binary file added DataSet_project.zip
Binary file not shown.