The dataset contains the date, the opening and closing stock prices, the daily high and low prices, and the volume of shares traded for the 5 years up to October 15th, 2021.
Data Cleaning and Preparation
In this stage the data is checked for accuracy and completeness prior to beginning the analysis. Some of the issues addressed are as follows:
- Remove extraneous data
- Check for missing values
- Replace missing values
- Delete data that cannot be corrected/replaced
- Correct any data formatting issues
- Create new features
- Identify errors revealed when new variables are created
Missing Values
#Identifying total number of missing values
sum(is.na(tesla))
[1] 0
There are no missing values.
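A single total does not show which columns any gaps would fall in. As a minimal sketch on a made-up data frame (the name toy and its values are illustrative assumptions, not the Tesla data), a per-column count can be obtained with colSums:

```r
# Toy data frame standing in for the stock data (values are made up)
toy <- data.frame(
  Close.Last = c(10.5, NA, 11.2, 11.0),
  Volume     = c(1000, 1200, NA, NA)
)

# Total number of missing values across the whole data frame
total_missing <- sum(is.na(toy))

# Number of missing values in each column
missing_by_column <- colSums(is.na(toy))

print(total_missing)
print(missing_by_column)
```

If the total were non-zero on the real data, the per-column counts would indicate which variables need replacing or removing.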
Creating New Features
Moving Average
We will create new columns to track the moving average of the daily Closing Stock Price over a set period. The moving average smooths out the stock price data by creating a constantly updated average stock price. We will track the moving average across two periods:
- 7 Day Period
- 30 Day Period
#Calculating 7-day and 30-day moving averages of the closing stock price
tesla$Moving_Average_7_Day <- round(ma(tesla$Close.Last, 7),2)
tesla$Moving_Average_30_Day <- round(ma(tesla$Close.Last, 30),2)
Identify Issues Revealed due to New Features
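A moving average needs a full window of observations for each date. Where the window is incomplete there is not enough data to calculate an average, and a missing value is left instead. Note that the most recent date in the summary below (2021-09-23) is earlier than the end of the raw data. This suggests that ma() centres its window on each date (as the forecast package's ma() does by default), leaving missing values at both ends of the series rather than only at the start. To rectify these missing values we will have to remove the associated rows.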
#Removing rows with missing data
tesla <- na.omit(tesla)
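To illustrate where these missing values come from, here is a small sketch using base R's stats::filter rather than the ma() call above. The toy vector x and the window size k are assumptions chosen for readability. A trailing window leaves gaps only at the start of the series, while a centred window leaves gaps at both ends.

```r
# Toy closing prices (made-up values) and a 3-observation window
x <- c(10, 12, 11, 13, 15, 14, 16)
k <- 3

# Trailing average: each value averages the current point and the k - 1 before it
trailing <- as.numeric(stats::filter(x, rep(1 / k, k), sides = 1))

# Centred average: each value averages the point and its neighbours on both sides
centred <- as.numeric(stats::filter(x, rep(1 / k, k), sides = 2))

print(round(trailing, 2))
print(round(centred, 2))

# Rows without a full window are NA in at least one column and are dropped by na.omit()
toy <- data.frame(x = x, trailing = trailing, centred = centred)
print(nrow(na.omit(toy)))
```

The namespace prefix on stats::filter is deliberate: dplyr also exports a function called filter, which would otherwise mask it.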
Exploratory Data Analysis
In this stage, we will examine the data to identify any patterns, trends and relationships between the variables. It will help us analyze the data and extract insights that can be used to make decisions.
Data Visualization will give us a clear idea of what the data means by giving it visual context.
Statistics
Let's take a look at the data as a whole to understand how the Tesla Stock Price has varied over the last 5 years.
summary(tesla)
Date                Close.Last      Volume             Open            High            Low             Moving_Average_7_Day
Min.   :2016-11-07  Min.   : 35.79  Min.   :  9800558  Min.   : 36.22  Min.   : 36.95  Min.   : 35.40  Min.   : 37.02
1st Qu.:2018-01-28  1st Qu.: 57.88  1st Qu.: 25252174  1st Qu.: 57.53  1st Qu.: 58.98  1st Qu.: 56.53  1st Qu.: 57.73
Median :2019-04-17  Median : 68.04  Median : 35270215  Median : 68.06  Median : 69.22  Median : 66.92  Median : 68.02
Mean   :2019-04-17  Mean   :202.45  Mean   : 44992494  Mean   :202.33  Mean   :206.90  Mean   :197.71  Mean   :202.47
3rd Qu.:2020-07-07  3rd Qu.:274.46  3rd Qu.: 53038012  3rd Qu.:279.20  3rd Qu.:282.14  3rd Qu.:263.54  3rd Qu.:271.12
Max.   :2021-09-23  Max.   :883.09  Max.   :304693800  Max.   :891.38  Max.   :900.40  Max.   :871.60  Max.   :859.24
Moving_Average_30_Day
Min.   : 37.84
1st Qu.: 58.33
Median : 67.24
Mean   :202.56
3rd Qu.:257.06
Max.   :836.38
As seen in the summary above:
- Oldest data point is from 2016-11-07 and the most recent data point is from 2021-09-23.
- Highest Closing Stock Price was $883.09, while the lowest Closing Stock Price was $35.79.
- Highest Opening Stock Price was $891.38, while the lowest Opening Stock Price was $36.22.
- The Highest Stock Price over the past 5 years was $900.40.
- The Lowest Stock Price over the past 5 years was $35.40.
- The Volume of shares traded averaged around 45 million per day, with a maximum of around 305 million.
Closing Stock Price
#Plotting the Closing Stock Price over the last 5 years
ggplot(tesla, aes(x=Date, y=Close.Last )) +
geom_line() +
labs(title="Tesla Closing Stock Price Over the Last 5 Years", x="Year", y="Tesla Closing Stock Price ($)") +
scale_x_date(date_minor_breaks = "1 month") +
theme_bw() +
theme(axis.text.x = element_text(size = 10), axis.title.x = element_text(size = 12),
axis.text.y = element_text(size = 10), axis.title.y = element_text(size = 12),
plot.title = element_text(size = 15))
As seen above, there has been a steep rise in the Tesla Stock Price since early 2020. Let's take a closer look at the stock price during this time.
Closing Stock Price since Jan 2020
#Plotting the Closing Stock Price from Jan 2020 onwards
ggplot(tesla, aes(x=Date, y=Close.Last )) +
geom_line() +
labs(title="Tesla Closing Stock Price Since Jan 2020", x="Month", y="Tesla Closing Stock Price ($)") +
scale_x_date(date_breaks = "1 month", date_labels = "%b", limits=c(as.Date("2020-01-01"),as.Date("2021-09-23"))) +
ylim(0,900) +
theme_bw() +
theme(axis.text.x = element_text(size = 10), axis.title.x = element_text(size = 12),
axis.text.y = element_text(size = 10), axis.title.y = element_text(size = 12),
plot.title = element_text(size = 15))
Warning: Removed 792 row(s) containing missing values (geom_path).
The stock price increased significantly from Jan 2020 onwards, reaching a peak around Feb 2021.
#The most recent stock price
recent_stock <- tesla %>% filter(Date == max(tesla$Date))
#Stock price at the beginning of 2020
#Note: No data for date 2020-01-01
stock_start_2020 <- tesla %>% filter(Date == as.Date("2020-01-02"))
#Stock price at the end of 2020
stock_end_2020 <- tesla %>% filter(Date == as.Date("2020-12-31"))
#Calculating the percentage increase between the start of 2020 and the most recent date
pct_recent <- round((recent_stock$Close.Last - stock_start_2020$Close.Last) / stock_start_2020$Close.Last * 100, 0)
#Calculating the percentage increase between the start and end of 2020
pct_2020 <- round((stock_end_2020$Close.Last - stock_start_2020$Close.Last) / stock_start_2020$Close.Last * 100, 0)
cat("The stock price increase: \n Jan 2020-Present",pct_recent,"% \n Jan 2020 - Dec 2020 End:",pct_2020,"%")
The stock price increase:
Jan 2020-Present 776 %
Jan 2020 - Dec 2020 End: 720 %
The stock price increased by 776% between the beginning of 2020 and the most recent data point (2021-09-23). In fact, it increased by 720% in 2020 alone.
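The percentage figures follow the usual formula, (new - old) / old * 100. The sketch below wraps it in a hypothetical helper, pct_change, and adds a second hypothetical helper, first_close_on_or_after, for cases where the chosen date is not a trading day (as with 2020-01-01 above). The prices are made-up values, not the Tesla data.

```r
# Toy price table (made-up values); 2020-01-01 is deliberately absent
prices <- data.frame(
  Date       = as.Date(c("2019-12-31", "2020-01-02", "2020-01-03", "2020-12-31")),
  Close.Last = c(80, 100, 104, 250)
)

# Hypothetical helper: percentage change from `old` to `new`
pct_change <- function(old, new) {
  (new - old) / old * 100
}

# Hypothetical helper: closing price on the first trading day on or after `day`
first_close_on_or_after <- function(df, day) {
  later <- df[df$Date >= as.Date(day), ]
  later$Close.Last[which.min(as.numeric(later$Date))]
}

start_price <- first_close_on_or_after(prices, "2020-01-01")
end_price   <- prices$Close.Last[prices$Date == as.Date("2020-12-31")]

print(start_price)
print(round(pct_change(start_price, end_price), 0))
```

Looking up the first available trading day avoids hard-coding 2020-01-02, and it does not depend on the order in which the rows are sorted.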
Moving Average
#Plotting the Closing Stock price vs the 7 Day Moving Average
ggplot(tesla, aes(x=Date)) +
geom_line(aes(y = Close.Last, colour="Close Price")) +
geom_line(aes(y = Moving_Average_7_Day, colour="7 Day")) +
labs(title="Tesla Stock Price vs. 7 Day Moving Average", x="Year", y="Tesla Stock Price ($)", colour = "Price & Moving Average") +
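#Named values tie each colour to its label; unnamed values are assigned in alphabetical order of the labels
scale_color_manual(values = c("Close Price" = "black", "7 Day" = "tan3")) +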
scale_x_date(date_minor_breaks = "1 month") +
theme_bw() +
theme(axis.text.x = element_text(size = 10), axis.title.x = element_text(size = 12),
axis.text.y = element_text(size = 10), axis.title.y = element_text(size = 12),
plot.title = element_text(size = 15))
#Plotting the Closing Stock price vs the 30 Day Moving Average
ggplot(tesla, aes(x=Date)) +
geom_line(aes(y = Close.Last, colour="Close Price")) +
geom_line(aes(y = Moving_Average_30_Day, colour="30 Day")) +
labs(title="Tesla Stock Price vs. 30 Day Moving Average", x="Year", y="Tesla Stock Price ($)", colour = "Price & Moving Average") +
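#Named values tie each colour to its label; unnamed values are assigned in alphabetical order of the labels
scale_color_manual(values = c("Close Price" = "black", "30 Day" = "orange")) +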
scale_x_date(date_minor_breaks = "1 month") +
theme_bw() +
theme(axis.text.x = element_text(size = 10), axis.title.x = element_text(size = 12),
axis.text.y = element_text(size = 10), axis.title.y = element_text(size = 12),
plot.title = element_text(size = 15))
The 30 Day Moving Average gives us a much smoother curve than the 7 Day Moving Average. Therefore, using the 30 Day Moving Average we can clearly see the points where the stock price deviated noticeably from its smoothed trend. The stock price fluctuations seem to be most significant in the period after early 2020, which is consistent with our earlier observation.
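One way to make "deviated noticeably" concrete is to measure the gap between the closing price and its moving average. The following is a minimal sketch on made-up values. The column names Deviation_Pct and Large_Deviation and the 10% cut-off are assumptions for illustration only, not part of the analysis above.

```r
# Toy prices and a matching moving average (made-up values)
toy <- data.frame(
  Close.Last     = c(100, 102, 120, 101, 85),
  Moving_Average = c(100, 101, 104, 103, 100)
)

# Gap between the price and its moving average, as a percentage of the average
toy$Deviation_Pct <- (toy$Close.Last - toy$Moving_Average) / toy$Moving_Average * 100

# Flag days where the price is more than 10% away from the average (arbitrary cut-off)
toy$Large_Deviation <- abs(toy$Deviation_Pct) > 10

print(toy)
```

On the real data, the same calculation could be applied to Close.Last and Moving_Average_30_Day to list the dates with the largest gaps.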
Stock Trading Volume
#Plotting the Stock Trading Volume over the last 5 years
ggplot(tesla, aes(x=Date, y=Volume)) +
geom_point(color = "orange") +
#Horizontal reference line at the approximate average volume (45 million)
geom_hline(yintercept = 45000000, color = "red") +
labs(title="Tesla Stock Trading Volume Over the Last 5 Years", x="Year", y="Number of Stocks Traded") +
scale_x_date(date_minor_breaks = "1 month") +
theme_bw() +
theme(axis.text.x = element_text(size = 10), axis.title.x = element_text(size = 12),
axis.text.y = element_text(size = 10), axis.title.y = element_text(size = 12),
plot.title = element_text(size = 15)) +
theme(legend.position = "none")
The volume of shares traded has generally remained around or below the average of 45 million (indicated by the line). There is an increase in volume during 2020, though it does seem to have been returning to normal since the beginning of 2021.
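A visual impression like this can be checked by averaging the volume per calendar year. Here is a minimal sketch with base R's aggregate on made-up values. The data frame toy and the column Year are illustrative assumptions.

```r
# Toy volume data (made-up values) spanning three calendar years
toy <- data.frame(
  Date   = as.Date(c("2019-03-01", "2019-09-02", "2020-03-02", "2020-09-01", "2021-03-01")),
  Volume = c(30e6, 40e6, 90e6, 70e6, 35e6)
)

# Extract the calendar year from each date
toy$Year <- format(toy$Date, "%Y")

# Mean daily volume per year
volume_by_year <- aggregate(Volume ~ Year, data = toy, FUN = mean)

print(volume_by_year)
```

Applied to the real data, a higher yearly mean for 2020 than for the surrounding years would support the pattern seen in the plot.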
Summary of Data Analysis
- Oldest data point is from 2016-11-07 and the most recent data point is from 2021-09-23.
- Highest Closing Stock Price was $883.09, while the lowest Closing Stock Price was $35.79.
- Highest Opening Stock Price was $891.38, while the lowest Opening Stock Price was $36.22.
- The Highest Stock Price over the past 5 years was $900.40.
- The Lowest Stock Price over the past 5 years was $35.40.
- The Volume of shares traded averaged around 45 million per day, with a maximum of around 305 million.
The Tesla Stock Price has increased significantly since 2016. The stock price remained relatively steady until early 2020, when it began to rise drastically. It increased by 776% between the beginning of 2020 and the most recent data point (2021-09-23). In fact, it increased by 720% in 2020 alone. Although the stock price peaked around Feb 2021, it has continued its general upward trend. However, during this period of increase the daily stock price fluctuated significantly compared to its relatively stable price prior to 2020.
The volume of shares traded during the last 5 years has generally remained around or below the average of 45 million. There is an increase in volume during 2020, which corresponds to the steep increase in stock price. We would typically expect an increase in trading activity like this during a period when the company’s share price is rising rapidly.
