Wednesday, 25 June 2025

Data Management Assignment

1) The Program

# -*- coding: utf-8 -*-
"""
Created on Wed Jun 25 14:21:19 2025

@author: Eden
"""
import pandas
import numpy  # imported but not used in this script

# Load the NESARC data set; low_memory=False reads the whole file in one pass
data = pandas.read_csv('nesarc_pds.csv', low_memory=False)

print(len(data))          # number of observations (rows)
print(len(data.columns))  # number of variables (columns)

# For each variable: counts first, then proportions (normalize=True)
print("Counts for TAB12MDX - Nicotine dependence in pas 12 Months, Yes=1")
count1 = data["TAB12MDX"].value_counts(sort=False)
print(count1)

print("Percentages for TAB12MDX - Nicotine dependence in the past 12 Months, Yes=1")
percentage1 = data["TAB12MDX"].value_counts(sort=False, normalize=True)
print(percentage1)

print("Counts for CHECK321 Smoked in the past 12 Months, Yes=1")
count2 = data["CHECK321"].value_counts(sort=False)
print(count2)

print("Percentages for CHECK321 Smoked in the past 12 Months, Yes=1")
percentage2 = data["CHECK321"].value_counts(sort=False, normalize=True)
print(percentage2)

print("Counts for S3AQ3B1 - Ususal Frequency When Smoked Cigaretes")
count3 = data["S3AQ3B1"].value_counts(sort=False)
print(count3)

print("Percentages for S3AQ3B1 - Ususal Frequency When Smoked Cigaretes")
percentage3 = data["S3AQ3B1"].value_counts(sort=False, normalize=True)
print(percentage3)

print("Counts for S3AQ3C1 - Ususal Quantity When Smoked Cigaretes")
count4 = data["S3AQ3C1"].value_counts(sort=False)
print(count4)

print("Percentages for S3AQ3C1 - Ususal Quantity When Smoked Cigaretes")
percentage4 = data["S3AQ3C1"].value_counts(sort=False, normalize=True)
print(percentage4)
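The program repeats the same two value_counts calls for each variable. As a rough sketch only, the pattern could be wrapped in a small helper that returns counts and proportions side by side, ordered by value. The name frequency_table is hypothetical, and the toy column below is made-up data standing in for the real file, so it runs without nesarc_pds.csv.

```python
import pandas as pd


def frequency_table(series):
    """Return counts and proportions for one column, ordered by value."""
    counts = series.value_counts(sort=False).sort_index()
    proportions = series.value_counts(sort=False, normalize=True).sort_index()
    return pd.DataFrame({"count": counts, "proportion": proportions})


# Toy stand-in for one NESARC column (made-up values, not the real survey).
toy = pd.DataFrame({"TAB12MDX": [0, 0, 1, 0, 1, 0, 0, 0]})

table = frequency_table(toy["TAB12MDX"])
print(table)
```

With the real data the call would be frequency_table(data["TAB12MDX"]), and likewise for the other three variables.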

2) The Frequency Table OUTPUT
43093
3010
Counts for TAB12MDX - Nicotine dependence in pas 12 Months, Yes=1
TAB12MDX
0 38131
1 4962
Name: count, dtype: int64
Percentages for TAB12MDX - Nicotine dependence in the past 12 Months, Yes=1
TAB12MDX
0 0.884854
1 0.115146
Name: proportion, dtype: float64
Counts for CHECK321 Smoked in the past 12 Months, Yes=1
CHECK321
1.0 9913
2.0 8078
9.0 22
Name: count, dtype: int64
Percentages for CHECK321 Smoked in the past 12 Months, Yes=1
CHECK321
1.0 0.550325
2.0 0.448454
9.0 0.001221
Name: proportion, dtype: float64
Counts for S3AQ3B1 - Ususal Frequency When Smoked Cigaretes
S3AQ3B1
1.0 14836
5.0 409
4.0 747
3.0 687
2.0 460
9.0 102
6.0 772
Name: count, dtype: int64
Percentages for S3AQ3B1 - Ususal Frequency When Smoked Cigaretes
S3AQ3B1
1.0 0.823627
5.0 0.022706
4.0 0.041470
3.0 0.038139
2.0 0.025537
9.0 0.005663
6.0 0.042858
Name: proportion, dtype: float64
Counts for S3AQ3C1 - Ususal Quantity When Smoked Cigaretes
S3AQ3C1
20.0 5366
5.0 1070
2.0 884
10.0 3077
3.0 923
99.0 262
1.0 934
40.0 993
7.0 269
4.0 573
15.0 851
30.0 909
6.0 463
8.0 299
13.0 34
98.0 15
14.0 25
25.0 155
60.0 241
12.0 230
16.0 40
50.0 106
45.0 8
9.0 49
18.0 59
35.0 30
21.0 1
11.0 23
80.0 47
66.0 1
17.0 22
19.0 5
39.0 1
27.0 2
29.0 3
24.0 7
70.0 12
22.0 10
37.0 2
23.0 2
34.0 1
55.0 2
57.0 1
75.0 2
33.0 1
28.0 3
Name: count, dtype: int64
Percentages for S3AQ3C1 - Ususal Quantity When Smoked Cigaretes
S3AQ3C1
20.0 0.297896
5.0 0.059402
2.0 0.049076
10.0 0.170821
3.0 0.051241
99.0 0.014545
1.0 0.051851
40.0 0.055127
7.0 0.014934
4.0 0.031810
15.0 0.047244
30.0 0.050464
6.0 0.025704
8.0 0.016599
13.0 0.001888
98.0 0.000833
14.0 0.001388
25.0 0.008605
60.0 0.013379
12.0 0.012769
16.0 0.002221
50.0 0.005885
45.0 0.000444
9.0 0.002720
18.0 0.003275
35.0 0.001665
21.0 0.000056
11.0 0.001277
80.0 0.002609
66.0 0.000056
17.0 0.001221
19.0 0.000278
39.0 0.000056
27.0 0.000111
29.0 0.000167
24.0 0.000389
70.0 0.000666
22.0 0.000555
37.0 0.000111
23.0 0.000111
34.0 0.000056
55.0 0.000111
57.0 0.000056
75.0 0.000111
33.0 0.000056
28.0 0.000167
Name: proportion, dtype: float64


3) Describing Frequency Distributions
TAB12MDX (Nicotine dependence in past 12 Months, Yes=1): This variable indicates nicotine dependence in the past 12 months. Of the 43,093 observations, 38,131 (approximately 88.49%) are coded 0, meaning no nicotine dependence, and 4,962 (about 11.51%) are coded 1, meaning nicotine dependence.

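CHECK321 (Smoked in the past 12 Months, Yes=1): This variable shows recent smoking behavior. Only 18,013 of the 43,093 observations have a value for it; value_counts leaves out missing entries by default, so the percentages are shares of those 18,013 responses. The most frequent response is 1.0 with 9,913 counts (approximately 55.03%), indicating that a majority of respondents in this subset smoked in the past 12 months. The other values are 2.0 (8,078 counts, about 44.85%) and 9.0 (22 counts, about 0.12%).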
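The CHECK321 counts add up to 18,013 rather than 43,093, which is consistent with value_counts dropping missing values by default (dropna=True). The sketch below uses a made-up column, not the survey data, to show how the proportions change when missing entries are counted as well.

```python
import numpy as np
import pandas as pd

# Toy column with missing entries (made-up values, not the real survey).
toy = pd.Series(
    [1.0, 2.0, np.nan, 1.0, np.nan, 9.0, 1.0, np.nan], name="CHECK321"
)

# Default behaviour: NaN is excluded, so shares are out of the 5 valid answers.
valid_only = toy.value_counts(sort=False, normalize=True)

# dropna=False keeps NaN as its own row, so shares are out of all 8 rows.
with_missing = toy.value_counts(sort=False, normalize=True, dropna=False)

n_valid = int(toy.notna().sum())

print(valid_only)
print(with_missing)
print("non-missing rows:", n_valid)
```

Printing both versions makes it explicit which denominator a percentage refers to.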

S3AQ3B1 (Usual Frequency When Smoked Cigarettes): This variable indicates the frequency of cigarette smoking. The value 1.0 is by far the most common, reported by 14,836 individuals (approximately 82.36%). The other values are 6.0 (772 counts, about 4.29%), 4.0 (747 counts, about 4.15%), 3.0 (687 counts, about 3.81%), 2.0 (460 counts, about 2.55%), 5.0 (409 counts, about 2.27%), and 9.0 (102 counts, about 0.57%).
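The percentages quoted for S3AQ3B1 can be checked by hand from the counts. The sketch below copies the seven counts from the output in section 2 and recomputes each share of the total; the variable names are illustrative.

```python
import pandas as pd

# S3AQ3B1 counts copied from the frequency table output above.
counts = pd.Series(
    {1.0: 14836, 5.0: 409, 4.0: 747, 3.0: 687, 2.0: 460, 9.0: 102, 6.0: 772},
    name="S3AQ3B1",
)

total = int(counts.sum())          # number of non-missing responses
percent = counts / total * 100     # share of each value, in percent

print("total responses:", total)
print(percent.sort_index().round(2))
```

The total comes to 18,013, the same number of non-missing responses as for CHECK321.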

S3AQ3C1 (Usual Quantity When Smoked Cigarettes): This variable indicates the quantity of cigarettes smoked. The most common quantity reported is 20.0, with 5,366 counts, representing about 29.79% of the responses. This suggests that 20 cigarettes (one pack) is the most typical quantity among the respondents, followed by 10.0 with 3,077 counts (about 17.08%).
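With as many distinct values as S3AQ3C1 has, the most common one is easier to pull out programmatically than to spot by eye. A minimal sketch, assuming a made-up column in place of the real data:

```python
import pandas as pd

# Toy quantities (made-up values, not the real survey).
toy = pd.Series([20, 10, 20, 5, 20, 10, 40, 20], name="S3AQ3C1")

shares = toy.value_counts(normalize=True)

# idxmax returns the value (index label) with the largest share.
most_common = shares.idxmax()
top_share = shares.loc[most_common]

print("most common value:", most_common)
print("share of responses:", top_share)
```

On the real data, the same two lines applied to data["S3AQ3C1"] should point to the 20.0 row described above.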
