The Beginner's Guide to Statistical Analysis | 5 Steps & Examples
Statistical analysis means investigating trends, patterns, and relationships using quantitative data . It is an important research tool used by scientists, governments, businesses, and other organizations.
To draw valid conclusions, statistical analysis requires careful planning from the very start of the research process . You need to specify your hypotheses and make decisions about your research design, sample size, and sampling procedure.
After collecting data from your sample, you can organize and summarize the data using descriptive statistics . Then, you can use inferential statistics to formally test hypotheses and make estimates about the population. Finally, you can interpret and generalize your findings.
This article is a practical introduction to statistical analysis for students and researchers. We’ll walk you through the steps using two research examples. The first investigates a potential cause-and-effect relationship, while the second investigates a potential correlation between variables.
Table of contents
- Step 1: Write your hypotheses and plan your research design
- Step 2: Collect data from a sample
- Step 3: Summarize your data with descriptive statistics
- Step 4: Test hypotheses or make estimates with inferential statistics
- Step 5: Interpret your results
To collect valid data for statistical analysis, you first need to specify your hypotheses and plan out your research design.
Writing statistical hypotheses
The goal of research is often to investigate a relationship between variables within a population . You start with a prediction, and use statistical analysis to test that prediction.
A statistical hypothesis is a formal way of writing a prediction about a population. Every research prediction is rephrased into null and alternative hypotheses that can be tested using sample data.
While the null hypothesis always predicts no effect or no relationship between variables, the alternative hypothesis states your research prediction of an effect or relationship.
- Null hypothesis: A 5-minute meditation exercise will have no effect on math test scores in teenagers.
- Alternative hypothesis: A 5-minute meditation exercise will improve math test scores in teenagers.
- Null hypothesis: Parental income and GPA have no relationship with each other in college students.
- Alternative hypothesis: Parental income and GPA are positively correlated in college students.
Planning your research design
A research design is your overall strategy for data collection and analysis. It determines the statistical tests you can use to test your hypothesis later on.
First, decide whether your research will use a descriptive, correlational, or experimental design. Experiments directly influence variables, whereas descriptive and correlational studies only measure variables.
- In an experimental design , you can assess a cause-and-effect relationship (e.g., the effect of meditation on test scores) using statistical tests of comparison or regression.
- In a correlational design , you can explore relationships between variables (e.g., parental income and GPA) without any assumption of causality using correlation coefficients and significance tests.
- In a descriptive design , you can study the characteristics of a population or phenomenon (e.g., the prevalence of anxiety in U.S. college students) using statistical tests to draw inferences from sample data.
Your research design also concerns whether you’ll compare participants at the group level or individual level, or both.
- In a between-subjects design , you compare the group-level outcomes of participants who have been exposed to different treatments (e.g., those who performed a meditation exercise vs those who didn’t).
- In a within-subjects design , you compare repeated measures from participants who have participated in all treatments of a study (e.g., scores from before and after performing a meditation exercise).
- In a mixed (factorial) design , one variable is altered between subjects and another is altered within subjects (e.g., pretest and posttest scores from participants who either did or didn’t do a meditation exercise).
First, you’ll take baseline test scores from participants. Then, your participants will undergo a 5-minute meditation exercise. Finally, you’ll record participants’ scores from a second math test.
In this experiment, the independent variable is the 5-minute meditation exercise, and the dependent variable is the math test score from before and after the intervention.

Example: Correlational research design
In a correlational study, you test whether there is a relationship between parental income and GPA in graduating college students. To collect your data, you will ask participants to fill in a survey and self-report their parents’ incomes and their own GPA.
When planning a research design, you should operationalize your variables and decide exactly how you will measure them.
For statistical analysis, it’s important to consider the level of measurement of your variables, which tells you what kind of data they contain:
- Categorical data represents groupings. These may be nominal (e.g., gender) or ordinal (e.g. level of language ability).
- Quantitative data represents amounts. These may be on an interval scale (e.g. test score) or a ratio scale (e.g. age).
Many variables can be measured at different levels of precision. For example, age data can be quantitative (8 years old) or categorical (young). If a variable is coded numerically (e.g., level of agreement from 1–5), it doesn’t automatically mean that it’s quantitative instead of categorical.
Identifying the measurement level is important for choosing appropriate statistics and hypothesis tests. For example, you can calculate a mean score with quantitative data, but not with categorical data.
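As a quick illustration of that last point, only quantitative data supports an arithmetic mean, while categorical data is summarized with the mode or proportions. A minimal sketch using Python's standard library, with made-up values:

```python
from statistics import mean, mode

# Quantitative (ratio) data: an arithmetic mean is meaningful.
ages = [19, 21, 20, 23, 22, 20]
print(mean(ages))  # ~20.83

# Categorical (nominal) data: only the mode or proportions make sense.
majors = ["biology", "physics", "biology", "history"]
print(mode(majors))  # "biology"
```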
In a research study, along with measures of your variables of interest, you’ll often collect data on relevant participant characteristics.
In most cases, it’s too difficult or expensive to collect data from every member of the population you’re interested in studying. Instead, you’ll collect data from a sample.
Statistical analysis allows you to apply your findings beyond your own sample as long as you use appropriate sampling procedures . You should aim for a sample that is representative of the population.
Sampling for statistical analysis
There are two main approaches to selecting a sample.
- Probability sampling: every member of the population has a chance of being selected for the study through random selection.
- Non-probability sampling: some members of the population are more likely than others to be selected for the study because of criteria such as convenience or voluntary self-selection.
In theory, for highly generalizable findings, you should use a probability sampling method. Random selection reduces several types of research bias , like sampling bias , and ensures that data from your sample is actually typical of the population. Parametric tests can be used to make strong statistical inferences when data are collected using probability sampling.
But in practice, it’s rarely possible to gather the ideal sample. While non-probability samples are at greater risk for biases like self-selection bias , they are much easier to recruit and collect data from. Non-parametric tests are more appropriate for non-probability samples, but they result in weaker inferences about the population.
If you want to use parametric tests for non-probability samples, you have to make the case that:
- your sample is representative of the population you’re generalizing your findings to.
- your sample lacks systematic bias.
Keep in mind that external validity means that you can only generalize your conclusions to others who share the characteristics of your sample. For instance, results from Western, Educated, Industrialized, Rich and Democratic samples (e.g., college students in the US) aren’t automatically applicable to all non-WEIRD populations.
If you apply parametric tests to data from non-probability samples, be sure to elaborate on the limitations of how far your results can be generalized in your discussion section .
Create an appropriate sampling procedure
Based on the resources available for your research, decide on how you’ll recruit participants.
- Will you have resources to advertise your study widely, including outside of your university setting?
- Will you have the means to recruit a diverse sample that represents a broad population?
- Do you have time to contact and follow up with members of hard-to-reach groups?
Your participants are self-selected by their schools. Although you’re using a non-probability sample, you aim for a diverse and representative sample.

Example: Sampling (correlational study)
Your main population of interest is male college students in the US. Using social media advertising, you recruit senior-year male college students from a smaller subpopulation: seven universities in the Boston area.
Calculate sufficient sample size
Before recruiting participants, decide on your sample size either by looking at other studies in your field or by using statistics. A sample that’s too small may be unrepresentative of the population, while a sample that’s too large will be more costly than necessary.

There are many sample size calculators online. Different formulas are used depending on whether you have subgroups or how rigorous your study should be (e.g., in clinical research). As a rule of thumb, a minimum of 30 units per subgroup is necessary.
To use these calculators, you have to understand and input these key components:
- Significance level (alpha): the risk of rejecting a true null hypothesis that you are willing to take, usually set at 5%.
- Statistical power : the probability of your study detecting an effect of a certain size if there is one, usually 80% or higher.
- Expected effect size : a standardized indication of how large the expected result of your study will be, usually based on other similar studies.
- Population standard deviation: an estimate of the population parameter based on a previous study or a pilot study of your own.
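These components can be combined into the standard normal-approximation formula for a two-group comparison, n per group ≈ 2·((z_alpha + z_power) / effect size)². A sketch using only Python's standard library; the effect size here is an illustrative assumption, not a value from the article:

```python
from math import ceil
from statistics import NormalDist

def two_group_sample_size(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-group comparison
    (normal approximation to the t test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = .05
    z_power = NormalDist().inv_cdf(power)          # ~0.84 for 80% power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Assuming a medium expected effect size (Cohen's d = 0.5):
n = two_group_sample_size(effect_size=0.5)
print(n)  # 63 participants per group
```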
Once you’ve collected all of your data, you can inspect them and calculate descriptive statistics that summarize them.
Inspect your data
There are various ways to inspect your data, including the following:
- Organizing data from each variable in frequency distribution tables .
- Displaying data from a key variable in a bar chart to view the distribution of responses.
- Visualizing the relationship between two variables using a scatter plot .
By visualizing your data in tables and graphs, you can assess whether your data follow a skewed or normal distribution and whether there are any outliers or missing data.
A normal distribution means that your data are symmetrically distributed around a center where most values lie, with the values tapering off at the tail ends.
In contrast, a skewed distribution is asymmetric and has more values on one end than the other. The shape of the distribution is important to keep in mind because only some descriptive statistics should be used with skewed distributions.
Extreme outliers can also produce misleading statistics, so you may need a systematic approach to dealing with these values.
Calculate measures of central tendency
Measures of central tendency describe where most of the values in a data set lie. Three main measures of central tendency are often reported:
- Mode : the most popular response or value in the data set.
- Median : the value in the exact middle of the data set when ordered from low to high.
- Mean : the sum of all values divided by the number of values.
However, depending on the shape of the distribution and level of measurement, only one or two of these measures may be appropriate. For example, many demographic characteristics can only be described using the mode or proportions, while a variable like reaction time may not have a mode at all.
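To get a concrete feel for how the three measures can differ, here is a minimal sketch with a made-up set of test scores:

```python
from statistics import mean, median, mode

scores = [67, 70, 70, 73, 75, 78, 82, 85]
print(mode(scores))    # most frequent value: 70
print(median(scores))  # middle of the ordered data: (73 + 75) / 2 = 74
print(mean(scores))    # sum / count = 600 / 8 = 75
```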
Calculate measures of variability
Measures of variability tell you how spread out the values in a data set are. Four main measures of variability are often reported:
- Range : the highest value minus the lowest value of the data set.
- Interquartile range : the range of the middle half of the data set.
- Standard deviation : the average distance between each value in your data set and the mean.
- Variance : the square of the standard deviation.
Once again, the shape of the distribution and level of measurement should guide your choice of variability statistics. The interquartile range is the best measure for skewed distributions, while standard deviation and variance provide the best information for normal distributions.
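All four measures can be computed directly. A sketch with the same kind of made-up scores; note that quartiles here follow Python's default "exclusive" method, so other software may give slightly different cut points:

```python
from statistics import stdev, variance, quantiles

scores = [67, 70, 70, 73, 75, 78, 82, 85]
data_range = max(scores) - min(scores)  # highest minus lowest value
q1, _, q3 = quantiles(scores, n=4)      # quartile cut points
iqr = q3 - q1                           # range of the middle half
print(data_range, iqr, variance(scores), stdev(scores))
```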
Using your table, you should check whether the units of the descriptive statistics are comparable for pretest and posttest scores. For example, are the variance levels similar across the groups? Are there any extreme values? If there are, you may need to identify and remove extreme outliers in your data set or transform your data before performing a statistical test.
From this table, we can see that the mean score increased after the meditation exercise, and the variances of the two scores are comparable. Next, we can perform a statistical test to find out if this improvement in test scores is statistically significant in the population.

Example: Descriptive statistics (correlational study)
After collecting data from 653 students, you tabulate descriptive statistics for annual parental income and GPA.
It’s important to check whether you have a broad range of data points. If you don’t, your data may be skewed towards some groups more than others (e.g., high academic achievers), and only limited inferences can be made about a relationship.
A number that describes a sample is called a statistic , while a number describing a population is called a parameter . Using inferential statistics , you can make conclusions about population parameters based on sample statistics.
Researchers often use two main methods (simultaneously) to make inferences in statistics.
- Estimation: calculating population parameters based on sample statistics.
- Hypothesis testing: a formal process for testing research predictions about the population using samples.
You can make two types of estimates of population parameters from sample statistics:
- A point estimate : a value that represents your best guess of the exact parameter.
- An interval estimate : a range of values that represent your best guess of where the parameter lies.
If your aim is to infer and report population characteristics from sample data, it’s best to use both point and interval estimates in your paper.
You can consider a sample statistic a point estimate for the population parameter when you have a representative sample (e.g., in a wide public opinion poll, the proportion of a sample that supports the current government is taken as the population proportion of government supporters).
There’s always error involved in estimation, so you should also provide a confidence interval as an interval estimate to show the variability around a point estimate.
A confidence interval uses the standard error and the z score from the standard normal distribution to convey where you’d generally expect to find the population parameter most of the time.
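For a mean, that calculation is short. A sketch with made-up scores, using the z score for a 95% confidence level (with samples this small, a t score would be more appropriate in practice):

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

scores = [72, 75, 78, 74, 76, 73, 77, 79, 75, 71]
se = stdev(scores) / sqrt(len(scores))  # standard error of the mean
z = NormalDist().inv_cdf(0.975)         # z score for a 95% confidence level
low, high = mean(scores) - z * se, mean(scores) + z * se
print(round(low, 2), round(high, 2))    # interval around the point estimate of 75
```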
Using data from a sample, you can test hypotheses about relationships between variables in the population. Hypothesis testing starts with the assumption that the null hypothesis is true in the population, and you use statistical tests to assess whether the null hypothesis can be rejected or not.
Statistical tests determine where your sample data would lie on an expected distribution of sample data if the null hypothesis were true. These tests give two main outputs:
- A test statistic tells you how much your data differs from the null hypothesis of the test.
- A p value tells you the likelihood of obtaining your results if the null hypothesis is actually true in the population.
Statistical tests come in three main varieties:
- Comparison tests assess group differences in outcomes.
- Regression tests assess cause-and-effect relationships between variables.
- Correlation tests assess relationships between variables without assuming causation.
Your choice of statistical test depends on your research questions, research design, sampling method, and data characteristics.
Parametric tests make powerful inferences about the population based on sample data. But to use them, some assumptions must be met, and only some types of variables can be used. If your data violate these assumptions, you can perform appropriate data transformations or use alternative non-parametric tests instead.
A regression test models the extent to which changes in a predictor variable result in changes in an outcome variable.
- A simple linear regression includes one predictor variable and one outcome variable.
- A multiple linear regression includes two or more predictor variables and one outcome variable.
Comparison tests usually compare the means of groups. These may be the means of different groups within a sample (e.g., a treatment and control group), the means of one sample group taken at different times (e.g., pretest and posttest scores), or a sample mean and a population mean.
- A t test is for exactly 1 or 2 groups when the sample is small (30 or fewer).
- A z test is for exactly 1 or 2 groups when the sample is large.
- An ANOVA is for 3 or more groups.
The z and t tests have subtypes based on the number and types of samples and the hypotheses:
- If you have only one sample that you want to compare to a population mean, use a one-sample test .
- If you have paired measurements (within-subjects design), use a dependent (paired) samples test .
- If you have completely separate measurements from two unmatched groups (between-subjects design), use an independent (unpaired) samples test .
- If you expect a difference between groups in a specific direction, use a one-tailed test .
- If you don’t have any expectations for the direction of a difference between groups, use a two-tailed test .
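As an illustration of the dependent-samples case, the paired t statistic is simply the mean of the within-subject differences divided by its standard error. A sketch with invented pretest/posttest scores (not the article's data):

```python
from math import sqrt
from statistics import mean, stdev

# Invented scores for eight participants measured twice (within-subjects).
pre  = [70, 68, 75, 71, 73, 69, 74, 72]
post = [72, 71, 76, 75, 75, 72, 76, 75]

diffs = [b - a for a, b in zip(pre, post)]
t = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))  # paired t statistic
print(round(t, 2))
```

The resulting t value would then be compared against a t distribution with n − 1 degrees of freedom to obtain a p value.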
The only parametric correlation test is Pearson’s r . The correlation coefficient ( r ) tells you the strength of a linear relationship between two quantitative variables.
However, to test whether the correlation in the sample is strong enough to be important in the population, you also need to perform a significance test of the correlation coefficient, usually a t test, to obtain a p value. This test uses your sample size to calculate how much the correlation coefficient differs from zero in the population.
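Pearson's r itself is the covariance of the two variables divided by the product of their standard deviations. Equivalently, with made-up paired data:

```python
from math import sqrt
from statistics import mean

x = [1, 2, 3, 4, 5]  # e.g. one quantitative variable
y = [2, 4, 5, 4, 5]  # e.g. a second quantitative variable
mx, my = mean(x), mean(y)
num = sum((a - mx) * (b - my) for a, b in zip(x, y))
den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
r = num / den  # Pearson correlation coefficient
print(round(r, 4))
```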
You use a dependent-samples, one-tailed t test to assess whether the meditation exercise significantly improved math test scores. The test gives you:
- a t value (test statistic) of 3.00
- a p value of 0.0028
Although Pearson’s r is a test statistic, it doesn’t tell you anything about how significant the correlation is in the population. You also need to test whether this sample correlation coefficient is large enough to demonstrate a correlation in the population.
A t test can also determine how significantly a correlation coefficient differs from zero based on sample size. Since you expect a positive correlation between parental income and GPA, you use a one-sample, one-tailed t test. The t test gives you:
- a t value of 3.08
- a p value of 0.001
The final step of statistical analysis is interpreting your results.
In hypothesis testing, statistical significance is the main criterion for forming conclusions. You compare your p value to a set significance level (usually 0.05) to decide whether your results are statistically significant or non-significant.
Statistically significant results are considered unlikely to have arisen solely due to chance. There is only a very low chance of such a result occurring if the null hypothesis is true in the population.
This means that you believe the meditation intervention, rather than random factors, directly caused the increase in test scores.

Example: Interpret your results (correlational study)
You compare your p value of 0.001 to your significance threshold of 0.05. With a p value under this threshold, you can reject the null hypothesis. This indicates a statistically significant correlation between parental income and GPA in male college students.
Note that correlation doesn’t always mean causation, because there are often many underlying factors contributing to a complex variable like GPA. Even if one variable is related to another, this may be because of a third variable influencing both of them, or indirect links between the two variables.
A statistically significant result doesn’t necessarily mean that there are important real life applications or clinical outcomes for a finding.
In contrast, the effect size indicates the practical significance of your results. It’s important to report effect sizes along with your inferential statistics for a complete picture of your results. You should also report interval estimates of effect sizes if you’re writing an APA style paper .
With a Cohen’s d of 0.72, there’s medium to high practical significance to your finding that the meditation exercise improved test scores.

Example: Effect size (correlational study)
To determine the effect size of the correlation coefficient, you compare your Pearson’s r value to Cohen’s effect size criteria.
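As a sketch of where a Cohen's d like this comes from: it is the difference between two means divided by a pooled standard deviation. The summary statistics below are invented for illustration (they are not the article's data):

```python
from math import sqrt

# Invented summary statistics: pretest vs posttest mean and SD.
m_pre, s_pre = 75.0, 6.0
m_post, s_post = 79.0, 5.0

pooled_sd = sqrt((s_pre ** 2 + s_post ** 2) / 2)  # pooled SD, equal group sizes
d = (m_post - m_pre) / pooled_sd                  # Cohen's d
print(round(d, 2))  # 0.72, a medium-to-large effect by Cohen's criteria
```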
Type I and Type II errors are mistakes made in research conclusions. A Type I error means rejecting the null hypothesis when it’s actually true, while a Type II error means failing to reject the null hypothesis when it’s false.
You can aim to minimize the risk of these errors by selecting an optimal significance level and ensuring high power . However, there’s a trade-off between the two errors, so a fine balance is necessary.
Frequentist versus Bayesian statistics
Traditionally, frequentist statistics emphasizes null hypothesis significance testing and always starts with the assumption of a true null hypothesis.
However, Bayesian statistics has grown in popularity as an alternative approach in the last few decades. In this approach, you use previous research to continually update your hypotheses based on your expectations and observations.
Bayes factor compares the relative strength of evidence for the null versus the alternative hypothesis rather than making a conclusion about rejecting the null hypothesis or not.
What Is Data Analysis? (With Examples)
Data analysis is the practice of working with data to glean useful information, which can then be used to make informed decisions.
"It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts," Sherlock Holme's proclaims in Sir Arthur Conan Doyle's A Scandal in Bohemia.
This idea lies at the root of data analysis. When we can extract meaning from data, it empowers us to make better decisions. And we’re living in a time when we have more data than ever at our fingertips.
Companies are wising up to the benefits of leveraging data. Data analysis can help a bank to personalize customer interactions, a health care system to predict future health needs, or an entertainment company to create the next big streaming hit.
The World Economic Forum Future of Jobs Report 2020 listed data analysts and scientists as the top emerging job, followed immediately by AI and machine learning specialists, and big data specialists [ 1 ]. In this article, you'll learn more about the data analysis process, different types of data analysis, and recommended courses to help you get started in this exciting field.
Read more: How to Become a Data Analyst (with or Without a Degree)
Data analysis process
As the data available to companies continues to grow both in amount and complexity, so too does the need for an effective and efficient process by which to harness the value of that data. The data analysis process typically moves through several iterative phases. Let’s take a closer look at each.
Identify the business question you’d like to answer. What problem is the company trying to solve? What do you need to measure, and how will you measure it?
Collect the raw data sets you’ll need to help you answer the identified question. Data collection might come from internal sources, like a company’s client relationship management (CRM) software, or from secondary sources, like government records or social media application programming interfaces (APIs).
Clean the data to prepare it for analysis. This often involves purging duplicate and anomalous data, reconciling inconsistencies, standardizing data structure and format, and dealing with white spaces and other syntax errors.
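A toy sketch of what this cleaning step can look like in code, with invented survey records:

```python
# Invented raw records with a duplicate, stray whitespace, and mixed casing.
raw = [
    {"id": 1, "city": " Boston ", "income": "42000"},
    {"id": 2, "city": "boston",   "income": "55000"},
    {"id": 1, "city": " Boston ", "income": "42000"},  # duplicate record
]

seen, clean = set(), []
for row in raw:
    if row["id"] in seen:  # purge duplicate records
        continue
    seen.add(row["id"])
    clean.append({
        "id": row["id"],
        "city": row["city"].strip().title(),  # trim whitespace, standardize case
        "income": int(row["income"]),         # standardize type
    })
print(clean)
```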
Analyze the data. By manipulating the data using various data analysis techniques and tools, you can begin to find trends, correlations, outliers, and variations that tell a story. During this stage, you might use data mining to discover patterns within databases or data visualization software to help transform data into an easy-to-understand graphical format.
Interpret the results of your analysis to see how well the data answered your original question. What recommendations can you make based on the data? What are the limitations to your conclusions?
Watch this video to hear how Kevin, Director of Data Analytics at Google, defines data analysis.
Learn more: What Does a Data Analyst Do? A Career Guide
Types of data analysis (with examples)
Data can be used to answer questions and support decisions in many different ways. To identify the best way to analyze your data, it can help to familiarize yourself with the four types of data analysis commonly used in the field.
In this section, we’ll take a look at each of these data analysis methods, along with an example of how each might be applied in the real world.
Descriptive analysis tells us what happened. This type of analysis helps describe or summarize quantitative data by presenting statistics. For example, descriptive statistical analysis could show the distribution of sales across a group of employees and the average sales figure per employee.
Descriptive analysis answers the question, “what happened?”
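A minimal sketch of a descriptive summary like the sales example above, with invented figures:

```python
from statistics import mean

# Invented monthly sales per employee.
sales = {"Ana": 12, "Ben": 9, "Chen": 15, "Dev": 12}
print(mean(sales.values()))        # average sales per employee
print(max(sales, key=sales.get))   # employee with the highest sales
```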
If the descriptive analysis determines the “what,” diagnostic analysis determines the “why.” Let’s say a descriptive analysis shows an unusual influx of patients in a hospital. Drilling into the data further might reveal that many of these patients shared symptoms of a particular virus. This diagnostic analysis can help you determine that an infectious agent—the “why”—led to the influx of patients.
Diagnostic analysis answers the question, “why did it happen?”
So far, we’ve looked at types of analysis that examine and draw conclusions about the past. Predictive analytics uses data to form projections about the future. Using predictive analysis, you might notice that a given product has had its best sales during the months of September and October each year, leading you to predict a similar high point during the upcoming year.
Predictive analysis answers the question, “what might happen in the future?”
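A very simple form of this seasonal reasoning can be sketched in Python. The sales history below is made up, and averaging past months is a naive forecast (real predictive analytics uses far more sophisticated models), but it shows the shape of the idea:

```python
from statistics import mean

# Hypothetical unit sales by (year, month) for one product -- illustrative only.
history = {
    (2020, 9): 120, (2020, 10): 130, (2020, 11): 80,
    (2021, 9): 140, (2021, 10): 150, (2021, 11): 90,
}

# Naive seasonal forecast: average each calendar month's sales across years.
months = {m for (_, m) in history}
forecast = {m: mean(v for (_, mm), v in history.items() if mm == m) for m in months}

peak_month = max(forecast, key=forecast.get)
print(f"Expected peak month next year: {peak_month} (avg {forecast[peak_month]} units)")
```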
Prescriptive analysis takes all the insights gathered from the first three types of analysis and uses them to form recommendations for how a company should act. Using our previous example, this type of analysis might suggest a market plan to build on the success of the high sales months and harness new growth opportunities in the slower months.
Prescriptive analysis answers the question, “what should we do about it?”
This last type is where the concept of data-driven decision-making comes into play.
Read more: Advanced Analytics: Definition, Benefits, and Use Cases
What is data-driven decision-making (DDDM)?
Data-driven decision-making, sometimes abbreviated as DDDM, is the process of making strategic business decisions based on facts, data, and metrics instead of intuition, emotion, or observation.
This might sound obvious, but in practice, not all organizations are as data-driven as they could be. According to global management consulting firm McKinsey Global Institute, data-driven companies are better at acquiring new customers, maintaining customer loyalty, and achieving above-average profitability [ 2 ].
Get started with Coursera
If you’re interested in a career in the high-growth field of data analytics, you can begin building job-ready skills with the Google Data Analytics Professional Certificate . Prepare yourself for an entry-level job as you learn from Google employees — no experience or degree required. Once you finish, you can apply directly with more than 130 US employers (including Google).
Frequently asked questions (FAQ)
Where is data analytics used?
Just about any business or organization can use data analytics to help inform their decisions and boost their performance. Some of the most successful companies across a range of industries — from Amazon and Netflix to Starbucks and General Electric — integrate data into their business plans to improve their overall business performance.
What are the top skills for a data analyst?
Data analysis makes use of a range of analysis tools and technologies. Some of the top skills for data analysts include SQL, data visualization, statistical programming languages (like R and Python), machine learning, and spreadsheets.
Read: 7 In-Demand Data Analyst Skills to Get Hired in 2022
What is a data analyst job salary?
Data from Glassdoor indicates that the average salary for a data analyst in the United States is $95,867 as of July 2022 [ 3 ]. How much you make will depend on factors like your qualifications, experience, and location.
Do data analysts need to be good at math?
Data analytics tends to be less math-intensive than data science. While you probably won’t need to master any advanced mathematics, a foundation in basic math and statistical analysis can help set you up for success.
Learn more: Data Analyst vs. Data Scientist: What’s the Difference?
1. World Economic Forum. "The Future of Jobs Report 2020," https://www.weforum.org/reports/the-future-of-jobs-report-2020. Accessed July 28, 2022.
2. McKinsey & Company. "Five facts: How customer analytics boosts corporate performance," https://www.mckinsey.com/business-functions/marketing-and-sales/our-insights/five-facts-how-customer-analytics-boosts-corporate-performance. Accessed July 28, 2022.
3. Glassdoor. "Data Analyst Salaries," https://www.glassdoor.com/Salaries/data-analyst-salary-SRCH_KO0,12.htm. Accessed July 28, 2022.
This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.
- How to Write a Comparative Analysis
Throughout your academic career, you'll be asked to write papers in which you compare and contrast two things: two texts, two theories, two historical figures, two scientific processes, and so on. "Classic" compare-and-contrast papers, in which you weight A and B equally, may be about two similar things that have crucial differences (two pesticides with different effects on the environment) or two seemingly dissimilar things that turn out to have surprising commonalities (two politicians with vastly different world views who voice unexpectedly similar perspectives on sexual harassment).
In the "lens" (or "keyhole") comparison, in which you weight A less heavily than B, you use A as a lens through which to view B. Just as looking through a pair of glasses changes the way you see an object, using A as a framework for understanding B changes the way you see B. Lens comparisons are useful for illuminating, critiquing, or challenging the stability of a thing that, before the analysis, seemed perfectly understood. Often, lens comparisons take time into account: earlier texts, events, or historical figures may illuminate later ones, and vice versa.
Faced with a daunting list of seemingly unrelated similarities and differences, you may feel confused about how to construct a paper that isn't just a mechanical exercise in which you first state all the features that A and B have in common, and then state all the ways in which A and B are different. Predictably, the thesis of such a paper is usually an assertion that A and B are very similar yet not so similar after all. To write a good compare-and-contrast paper, you must take your raw data—the similarities and differences you've observed—and make them cohere into a meaningful argument. Here are the five elements required.
Frame of Reference . This is the context within which you place the two things you plan to compare and contrast; it is the umbrella under which you have grouped them. The frame of reference may consist of an idea, theme, question, problem, or theory; a group of similar things from which you extract two for special attention; biographical or historical information. The best frames of reference are constructed from specific sources rather than your own thoughts or observations. Thus, in a paper comparing how two writers redefine social norms of masculinity, you would be better off quoting a sociologist on the topic of masculinity than spinning out potentially banal-sounding theories of your own. Most assignments tell you exactly what the frame of reference should be, and most courses supply sources for constructing it. If you encounter an assignment that fails to provide a frame of reference, you must come up with one on your own. A paper without such a context would have no angle on the material, no focus or frame for the writer to propose a meaningful argument.
Grounds for Comparison . Let's say you're writing a paper on global food distribution, and you've chosen to compare apples and oranges. Why these particular fruits? Why not pears and bananas? The rationale behind your choice, the grounds for comparison , lets your reader know why your choice is deliberate and meaningful, not random. For instance, in a paper asking how the "discourse of domesticity" has been used in the abortion debate, the grounds for comparison are obvious; the issue has two conflicting sides, pro-choice and pro-life. In a paper comparing the effects of acid rain on two forest sites, your choice of sites is less obvious. A paper focusing on similarly aged forest stands in Maine and the Catskills will be set up differently from one comparing a new forest stand in the White Mountains with an old forest in the same region. You need to indicate the reasoning behind your choice.
Thesis . The grounds for comparison anticipates the comparative nature of your thesis. As in any argumentative paper, your thesis statement will convey the gist of your argument, which necessarily follows from your frame of reference. But in a compare-and-contrast, the thesis depends on how the two things you've chosen to compare actually relate to one another. Do they extend, corroborate, complicate, contradict, correct, or debate one another? In the most common compare-and-contrast paper—one focusing on differences—you can indicate the precise relationship between A and B by using the word "whereas" in your thesis:
Whereas Camus perceives ideology as secondary to the need to address a specific historical moment of colonialism, Fanon perceives a revolutionary ideology as the impetus to reshape Algeria's history in a direction toward independence.
Whether your paper focuses primarily on difference or similarity, you need to make the relationship between A and B clear in your thesis. This relationship is at the heart of any compare-and-contrast paper.
Organizational Scheme . Your introduction will include your frame of reference, grounds for comparison, and thesis. There are two basic ways to organize the body of your paper.
- In text-by-text , you discuss all of A, then all of B.
- In point-by-point , you alternate points about A with comparable points about B.
If you think that B extends A, you'll probably use a text-by-text scheme; if you see A and B engaged in debate, a point-by-point scheme will draw attention to the conflict. Be aware, however, that the point-by-point scheme can come off as a ping-pong game. You can avoid this effect by grouping more than one point together, thereby cutting down on the number of times you alternate from A to B. But no matter which organizational scheme you choose, you need not give equal time to similarities and differences. In fact, your paper will be more interesting if you get to the heart of your argument as quickly as possible. Thus, a paper on two evolutionary theorists' different interpretations of specific archaeological findings might have as few as two or three sentences in the introduction on similarities and at most a paragraph or two to set up the contrast between the theorists' positions. The rest of the paper, whether organized text-by-text or point-by-point, will treat the two theorists' differences.
You can organize a classic compare-and-contrast paper either text-by-text or point-by-point. But in a "lens" comparison, in which you spend significantly less time on A (the lens) than on B (the focal text), you almost always organize text-by-text. That's because A and B are not strictly comparable: A is merely a tool for helping you discover whether or not B's nature is actually what expectations have led you to believe it is.
Linking of A and B . All argumentative papers require you to link each point in the argument back to the thesis. Without such links, your reader will be unable to see how new sections logically and systematically advance your argument. In a compare-and-contrast, you also need to make links between A and B in the body of your essay if you want your paper to hold together. To make these links, use transitional expressions of comparison and contrast ( similarly, moreover, likewise, on the contrary, conversely, on the other hand ) and contrastive vocabulary (in the example below, Southerner/Northerner ).
As a girl raised in the faded glory of the Old South, amid mystical tales of magnolias and moonlight, the mother remains part of a dying generation. Surrounded by hard times, racial conflict, and limited opportunities, Julian, on the other hand , feels repelled by the provincial nature of home, and represents a new Southerner, one who sees his native land through a condescending Northerner's eyes.
Copyright 1998, Kerry Walk, for the Writing Center at Harvard University