ICT110 Introduction to Data Science - University of the Sunshine Coast

Data analysis report of the health and population statistics of East Asian and Pacific countries

1. Introduction

Provide an introduction to the problem. Include background material as appropriate: who cares about this problem, what impact it has, where does the data come from, what are the dimensions and structures of the data.

1.1 Authorization and Purpose

The data represent basic preliminary information about road transport crash deaths in Australia, consolidated from the reports that police submit each month to the State and Territory road safety authorities.

1.2 Limitations

The study is limited to the Australian population, so the findings apply only to that population and its characteristics and cannot be generalised to other populations.

1.3 Scope

The major objective of this study is to determine the factors that influence road crash deaths. We also examine the distribution of road crash deaths to identify when fatalities are most frequent and at which speed limits they occur.

1.4 Methodology

The data were gathered in CSV format and analysed using the R software package. Descriptive statistics were used to understand the distribution of interval and ratio variables, and frequency distributions were used to understand the shape of qualitative variables. The independent-samples t test, one-way ANOVA and the chi-square test for independence were used to test the claims.
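The tests named above could be run in R roughly as follows. This is a minimal sketch on synthetic data: the data frame `crashes` and its column names are hypothetical stand-ins, not the columns of the crash file.

```r
# Illustrative sketch only: synthetic data with hypothetical column names
set.seed(1)
crashes <- data.frame(
  speed_limit = c(rnorm(60, 80, 15), rnorm(60, 70, 15)),
  bus         = factor(rep(c("No", "Yes"), each = 60)),
  crash_type  = factor(rep(c("Single", "Multiple", "Pedestrian"), times = 40)),
  christmas   = factor(sample(c("No", "Yes"), 120, replace = TRUE,
                              prob = c(0.75, 0.25)))
)

# Descriptive statistics for a ratio-scale variable
summary(crashes$speed_limit)

# Frequency distribution for a qualitative variable
table(crashes$crash_type)

# Independent-samples t test: mean speed limit by bus involvement
t_res <- t.test(speed_limit ~ bus, data = crashes)

# One-way ANOVA: mean speed limit across the three crash types
aov_res <- aov(speed_limit ~ crash_type, data = crashes)

# Chi-square test of independence between two categorical variables
chi_res <- chisq.test(table(crashes$crash_type, crashes$christmas))
```

Printing `t_res`, `summary(aov_res)` and `chi_res` shows the test statistic, degrees of freedom and p-value for each test.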

2. Data Setup

Describe how to load the data, and how the pre-processing is performed.

Data cleaning was performed by the researcher or by staff who are well trained in the process. Data cleaning is the process of identifying inconsistent records whose values deviate far from the rest of the dataset; these values need to be removed before performing the statistical analysis. For example, consider the obesity data. Here, some of the patients' demographic details are missing, and because they relate to an individual's personal details they cannot simply be filled in. Missing information therefore needs to be replaced carefully. For questions with Likert-scale ratings, missing values can be replaced by either the median or the mode. For continuous variables, missing values can be replaced by the mean.
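The replacement rules described above might look like this in R. It is a sketch under stated assumptions: the data frame `survey`, its columns and the helper `stat_mode` are hypothetical and introduced only for illustration.

```r
# Hypothetical toy data; the column names are illustrative only
survey <- data.frame(
  rating = c(4, 5, NA, 3, 4, NA, 2),                 # Likert-scale item (1-5)
  weight = c(70.5, NA, 82.0, 65.3, NA, 90.1, 77.6)   # continuous variable
)

# Most frequent value; base R has no built-in statistical mode function
stat_mode <- function(x) {
  counts <- table(x)   # table() drops NA values by default
  as.numeric(names(counts)[which.max(counts)])
}

# Ordinal item: fill gaps with the median (stat_mode(survey$rating) is the alternative)
survey$rating[is.na(survey$rating)] <- median(survey$rating, na.rm = TRUE)

# Continuous variable: fill gaps with the mean of the observed values
survey$weight[is.na(survey$weight)] <- mean(survey$weight, na.rm = TRUE)
```

Mean or median replacement keeps every row but shrinks the variance of the variable, so it is worth reporting how many values were filled in.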

3. Exploratory Data Analysis

Perform 2 one-variable analyses. Plot one graph for each variable. Explain the finding for each graph.

3.1 One variable analysis

3.1.1 One variable analysis 1

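# Load the crash data; adjust the path to where the CSV file is stored
mydata <- read.csv("D:/Downloads/crashdata.csv", header = TRUE, sep = ",")

# Make the columns available by name (e.g. Speed.Limit)
attach(mydata)

# Preview the first rows
head(mydata)

# Box plot of the speed limit variable
boxplot(Speed.Limit)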

The box plot for the speed limit variable is given below.

[Figure: box plot of speed limit]

From the above box plot, the distribution of speed limit appears approximately symmetric: the median line divides the box into roughly equal halves. Symmetry is consistent with a normal distribution but does not by itself establish one.
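A box plot mainly shows whether a distribution is symmetric, so a separate check of normality can be useful. The sketch below uses a synthetic, deliberately skewed sample (not the crash data) to show how the position of the median within the box can be quantified and how a normal Q-Q plot can be drawn.

```r
set.seed(7)
# Hypothetical right-skewed sample; not the crash data
x <- rexp(500, rate = 1 / 20)

# Five box plot statistics: lower whisker, Q1, median, Q3, upper whisker
stats <- boxplot.stats(x)$stats
q1 <- stats[2]; med <- stats[3]; q3 <- stats[4]

# In a symmetric distribution the median sits midway between the quartiles
lower_half <- med - q1
upper_half <- q3 - med
c(lower_half = lower_half, upper_half = upper_half)

# A normal Q-Q plot is a more direct check of normality than a box plot
qqnorm(x)
qqline(x)
```

For this skewed sample the upper half of the box is wider than the lower half; for the speed limit variable the two halves would be expected to be about equal if the box plot description holds.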

3.1.2 One variable analysis 2

The distribution of crash type is given below.

# table() counts each category, so this works whether Crash.Type is a factor or character
barplot(table(Crash.Type))

[Figure: bar chart of crash type]

The bar chart above shows that most fatal crashes fall into either the Single or the Multiple crash type. Preventive action should therefore focus on these two categories to minimize these kinds of fatalities.

3.1.3 One variable analysis 3

The time variable is categorized into eight groups, listed below.

12:00 am - 3:00 am

3:00 am - 6:00 am

6:00 am - 9:00 am

9:00 am - 12:00 pm

12:00 pm - 3:00 pm

3:00 pm - 6:00 pm

6:00 pm - 9:00 pm

9:00 pm - 12:00 am

[Figure: bar chart of crashes by three-hour time group]

From the above bar chart, we see that the largest number of crashes occurred between 9:00 am and 12:00 pm, a clear indication that fatalities occur during the peak hours of the day.
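The eight three-hour groups listed above could be derived from a time of day with `cut()`. The sketch below assumes crash times stored as "HH:MM" strings on a 24-hour clock; the vector `crash_time` is hypothetical, and how the original `Time3hours` column was actually built is not stated.

```r
# Hypothetical crash times as "HH:MM" strings (24-hour clock)
crash_time <- c("00:15", "02:59", "03:00", "08:45",
                "11:30", "12:00", "17:20", "23:59")

# Hour of day as an integer from 0 to 23
hour <- as.integer(substr(crash_time, 1, 2))

labels <- c("12:00 am - 3:00 am", "3:00 am - 6:00 am", "6:00 am - 9:00 am",
            "9:00 am - 12:00 pm", "12:00 pm - 3:00 pm", "3:00 pm - 6:00 pm",
            "6:00 pm - 9:00 pm", "9:00 pm - 12:00 am")

# right = FALSE makes each bin include its lower bound, e.g. hours 3, 4 and 5
time_group <- cut(hour, breaks = seq(0, 24, by = 3),
                  labels = labels, right = FALSE)

# Frequency of crashes in each three-hour group, ready for barplot()
table(time_group)
```

Because `time_group` is a factor with all eight levels, groups with no crashes still appear in the table with a count of zero.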

3.2 Two-variable analysis

3.2.1 Two-variable analysis 1

The association between crash type and time is assessed using a clustered bar chart.

[Figure: clustered bar chart of crash type by three-hour time group]

3.2.2 Two-variable analysis 2

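# Crash type by three-hour time group, drawn as a clustered bar chart
counts <- table(Crash.Type, Time3hours)
barplot(counts, beside = TRUE, legend.text = TRUE)

# Crash type by Christmas period
counts <- table(Crash.Type, Christmas.Period)
barplot(counts)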

[Figure: bar chart of crash type by Christmas period]

Here, we see that most crash fatalities happen outside the Christmas period, which is expected because the Christmas period covers only a small part of the year.
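Raw counts mostly reflect how many days fall in each period, so comparing the share of each crash type within a period can be more informative. The sketch below uses made-up counts purely for illustration; with the real data, `counts` would come from `table(Crash.Type, Christmas.Period)`.

```r
# Hypothetical counts: rows are crash types, columns are the Christmas period flag
counts <- matrix(c(900, 700, 300,
                    30,  25,  10),
                 nrow = 3,
                 dimnames = list(Crash.Type = c("Single", "Multiple", "Pedestrian"),
                                 Christmas.Period = c("No", "Yes")))

# Share of each crash type within each period (each column sums to 1)
shares <- prop.table(counts, margin = 2)
round(shares, 3)
```

If the two columns of `shares` look alike, the mix of crash types does not differ much between the Christmas period and the rest of the year.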

4. Advanced Analysis

Briefly explain the concept of clustering and k-means.
Perform 1 clustering analysis to group years according to a selected cause.

4.1 Clustering

K-means clustering is a technique used in cluster analysis to partition n observations into k clusters, where each observation belongs to the cluster with the nearest mean. Hierarchical clustering, on the other hand, is a method of cluster analysis that seeks to build a hierarchy of clusters. The number of ways to partition n observations into k non-overlapping clusters with at least one observation in each is the Stirling number of the second kind, S(n, k), which grows far too quickly for exhaustive search; k-means instead refines the cluster means iteratively.
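Grouping years by the number of deaths from a selected cause could be sketched with the base R function `kmeans()`. The yearly figures below are invented for illustration and the choice of three clusters is an assumption, not a result from the crash data.

```r
# Hypothetical yearly fatality counts; not taken from the crash data
yearly <- data.frame(
  year   = 2001:2012,
  deaths = c(510, 495, 505, 430, 425, 440, 435, 350, 345, 360, 355, 340)
)

set.seed(42)  # k-means starts from random centres, so fix the seed
fit <- kmeans(yearly$deaths, centers = 3, nstart = 25)

# Cluster label assigned to each year
yearly$cluster <- fit$cluster

# Cluster means: each year belongs to the cluster whose mean is nearest
fit$centers
```

The cluster numbers themselves are arbitrary labels; what matters is which years share a label. Setting `nstart` above 1 reruns the algorithm from several random starts and keeps the best solution.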

4.1.1 Brief explanation of k-means and clustering

ICT110 Introduction to Data Science5.jpg

4.2 Linear Regression

Briefly explain the concept of linear regression.
Perform 2 linear regression analysis. Plot the learned models.

4.2.1 Brief definition of Linear Regression

Simple linear regression analysis was performed to predict speed limit using bus involvement as the independent variable.

ICT110 Introduction to Data Science6.jpg

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.071177
R Square             0.005066
Adjusted R Square    0.005043
Standard Error       20.5181
Observations         43852

ANOVA
              df       SS          MS          F          Significance F
Regression    1        93999.58    93999.58    223.281    2.32E-50
Residual      43850    18460515    420.9924
Total         43851    18554514

                   Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept          82.33147        0.098823          833.1203    0           82.13777     82.52516
Bus Involvement    -11.3368        0.758694          -14.9426    2.32E-50    -12.8239     -9.84979

The coefficient of determination is 0.0051, indicating that only 0.51% of the variation in speed limit is explained by the regression model, while the remaining 99.49% is left unexplained.

The regression equation is

Speed limit = 82.33 - 11.34 * Bus Involvement
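The reported figures can be cross-checked from the sums of squares in the ANOVA table. The sketch below copies the values from the output above; the 0/1 coding of Bus Involvement is an assumption, since the source does not state how the variable was coded.

```r
# Values copied from the ANOVA and coefficient tables above
ss_regression <- 93999.58
ss_total      <- 18554514
ms_residual   <- 420.9924

# R Square is the share of total variation explained by the model
r_squared <- ss_regression / ss_total

# F statistic: regression mean square over residual mean square
# (the regression has 1 degree of freedom, so its MS equals its SS)
f_stat <- ss_regression / ms_residual

# Fitted speed limit, assuming Bus Involvement is coded 0 = No, 1 = Yes
predict_speed <- function(bus) 82.33147 - 11.3368 * bus
predict_speed(c(0, 1))
```

Under that coding, the intercept is the average speed limit for crashes without a bus, and the slope is the difference in average speed limit when a bus is involved.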

4.2.2 Linear Regression Model 2

Linear regression analysis was performed to predict speed limit using crash type as the independent variable, entered as the indicator variables Pedestrian and Multiple.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.333198
R Square             0.111021
Adjusted R Square    0.11098
Standard Error       19.39418
Observations         43861

ANOVA
              df       SS          MS          F           Significance F
Regression    2        2060177     1030088     2738.619    0
Residual      43858    16496494    376.1342
Total         43860    18556670

              Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept     87.23772        0.143139          609.4605    0           86.95716     87.51827
Pedestrian    -19.4699        0.264562          -73.5928    0           -19.9884     -18.9513
Multiple      -4.22375        0.203706          -20.7346    4.84E-95    -4.62302     -3.82449

The coefficient of determination is 0.111, indicating that 11.1% of the variation in speed limit is explained by the regression model, while the remaining 88.9% is left unexplained.

The regression equation is

Speed limit = 87.238 - 19.47 * Pedestrian - 4.22 * Multiple
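The equation can be read as one fitted speed limit per crash type. The sketch below copies the coefficients from the table above and assumes that Single is the baseline category (the one without its own indicator); the data frame `toy` is synthetic and only illustrates how `lm()` codes a categorical predictor.

```r
# Coefficients copied from the regression table; Single is assumed to be the baseline
b0     <- 87.23772
b_ped  <- -19.4699
b_mult <- -4.22375

fitted_speed <- c(
  Single     = b0,
  Pedestrian = b0 + b_ped,
  Multiple   = b0 + b_mult
)
round(fitted_speed, 2)

# With a factor predictor, lm() builds the indicator variables itself.
# Synthetic example (hypothetical data) to show the coding:
toy <- data.frame(
  speed = c(100, 110, 60, 50, 80, 90),
  type  = factor(rep(c("Single", "Pedestrian", "Multiple"), each = 2),
                 levels = c("Single", "Pedestrian", "Multiple"))
)
toy_coef <- coef(lm(speed ~ type, data = toy))
toy_coef
```

In the toy fit the intercept equals the mean of the baseline group and each remaining coefficient is the difference between a group mean and that baseline, which is the same reading applied to the Pedestrian and Multiple coefficients.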

5. Conclusion

The data represent basic preliminary information about road transport crash deaths in Australia, consolidated from the reports that police submit each month to the State and Territory road safety authorities. The majority of fatal crashes fall into the Single or Multiple crash types, and therefore proper preventive action should be taken to minimize these kinds of fatalities. Also, the largest number of crashes occurred between 9:00 am and 12:00 pm, a clear indication that fatalities occur during the peak hours of the day.

6. Reflection

The major objective of this study was to determine the factors that influence road crash deaths. We also examined the distribution of road crash deaths to identify when fatalities are most frequent and at which speed limits they occur. The study is limited to the Australian population, so the findings apply only to that population and its characteristics and cannot be generalised to other populations.
