This article will make Boston’s Airbnb hosts rethink their investment strategy.

Based on analysis of Airbnb listing data in Boston

Vaibhav Malhotra
5 min read · May 24, 2020

Airbnbs can bring in sizable earnings, but running one as a business is a difficult task. This article is for owners who want to understand the market and make some simple but effective decisions.

An important question a new host can ask is “How can I invest the minimum but get higher rents?” The answer is simple: invest where it actually matters the most. In this blog, we will use regression-based ML models to predict Airbnb prices and use the model to answer the questions below.

Let’s find answers to the following:

- Is it possible to accurately predict the most attractive price of your Airbnb?

- What services will actually help you raise the price?

- What are the most favored neighborhoods?

If you want a deeper dive, you can find the GitHub repository for my project here and the dataset here.

Let’s now jump right into it…

A quick look at the Data

The dataset has around 3585 entries under 95 columns (Woww..!).

The 95 column names

But are all the columns useful? Which columns really have an impact on the ‘price’? I passed the data through an ETL (Extract, Transform, Load) pipeline, which involved cleaning the data, transforming it into a useful format, handling null values, etc., and finally stored it in a CSV file for further use. Have a look at the code here.
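To make the pipeline idea concrete, here is a minimal sketch of the three steps using only Python's standard library. It is not the project's actual code: the sample rows, the `$1,200.00` price format, the column names, and the choice to fill missing bedroom counts with the median are all assumptions made for illustration.

```python
import csv
import io
import statistics

# Hypothetical raw rows; the real file has 95 columns.
RAW_CSV = """id,price,bedrooms,room_type
1,$250.00,2,Entire home/apt
2,"$1,200.00",,Entire home/apt
3,$65.00,1,Private room
"""


def parse_price(text):
    """Turn a string such as '$1,200.00' into the float 1200.0."""
    return float(text.replace("$", "").replace(",", ""))


def clean_rows(raw_csv):
    """Extract rows from CSV text, transform prices, and fill null bedrooms."""
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    for row in rows:
        row["price"] = parse_price(row["price"])
    # Assumed null-handling strategy: fill gaps with the median known value.
    known = [float(r["bedrooms"]) for r in rows if r["bedrooms"] != ""]
    median_bedrooms = statistics.median(known)
    for row in rows:
        if row["bedrooms"] == "":
            row["bedrooms"] = median_bedrooms
        else:
            row["bedrooms"] = float(row["bedrooms"])
    return rows


def to_csv(rows):
    """Load step: serialize the cleaned rows back to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


cleaned = clean_rows(RAW_CSV)
print(to_csv(cleaned))
```

On real data you would write the result to a file rather than print it, and pick a null-handling strategy per column instead of one rule for everything.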

A look at the ‘price’ variable:

Wow! A price of $4000 seems quite expensive. Exploring the dataset showed that although the extremely high-value rentals may sometimes be actual mansions for rent, for the most part they’re scalpers setting unrealistic valuations or just joke listings. These are known as outliers. We can find them using Tukey's rule, which here meant keeping only rows with a price below $500.
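For readers who have not met Tukey's rule before: it flags any value more than 1.5 times the interquartile range (IQR) beyond the first or third quartile. Below is a small sketch on made-up prices, so the fence it produces is not the $500 cutoff from the real data. The sample list and function names are hypothetical.

```python
import statistics

# Hypothetical nightly prices, including one joke listing.
prices = [60, 80, 95, 110, 125, 150, 180, 220, 300, 4000]


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) Tukey fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


lower, upper = tukey_fences(prices)
kept = [p for p in prices if lower <= p <= upper]
outliers = [p for p in prices if p < lower or p > upper]

print("fences:", lower, upper)
print("kept:", kept)
print("outliers:", outliers)
```

Different quartile conventions shift the fences slightly, so treat the exact cutoff as a guide rather than a hard law.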

Is it possible to accurately predict the most attractive price of your Airbnb?

I used PyCaret’s regression module for prediction. You can have a look at the code here.

The image on the right shows the model results. Our best model predicts the price with an RMSE of $41.25, meaning its predictions are typically off by roughly that much (RMSE weights large misses more heavily than a plain average error would). But don’t worry: our purpose is to find the major players when predicting price.
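If RMSE is new to you, this short sketch shows how it is computed and how it differs from the mean absolute error (MAE). The actual and predicted prices are invented for illustration and have nothing to do with the model's real output.

```python
import math

# Hypothetical actual vs. predicted nightly prices.
actual = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 230.0, 250.0]


def rmse(y_true, y_pred):
    """Root mean squared error: square the errors, average them, take the root."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error: the plain average size of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


print("RMSE:", rmse(actual, predicted))
print("MAE:", mae(actual, predicted))
```

The single $30 miss pulls RMSE above MAE, which is why RMSE is best read as a typical error size that is sensitive to big misses.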

Below is the SHAP representation of the top variables contributing towards our price. This seems interesting, but what exactly is a SHAP representation? There is an amazing article explaining it here.
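SHAP is built on Shapley values, which split a prediction's difference from a baseline fairly among the features. Here is a toy sketch that computes exact Shapley values by brute force over every feature ordering. The pricing formula, the listing, and the baseline are all made up, and the real SHAP library uses faster approximations over background data rather than this exhaustive loop.

```python
from itertools import permutations


def toy_price(f):
    """A made-up pricing formula with an interaction term (not the real model)."""
    return (50 + 40 * f["entire_home"] + 30 * f["bedrooms"]
            + 10 * f["entire_home"] * f["bedrooms"] + 20 * f["bathrooms"])


def shapley_values(model, instance, baseline):
    """Average each feature's marginal contribution over all feature orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)  # start from the baseline listing
        previous = model(current)
        for name in order:
            current[name] = instance[name]  # switch one feature to its real value
            value = model(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


listing = {"entire_home": 1, "bedrooms": 2, "bathrooms": 1}
baseline = {"entire_home": 0, "bedrooms": 1, "bathrooms": 1}

contributions = shapley_values(toy_price, listing, baseline)
print(contributions)
print("sum:", sum(contributions.values()),
      "gap:", toy_price(listing) - toy_price(baseline))
```

The contributions always add up to the gap between the listing's predicted price and the baseline's, which is what makes a SHAP plot readable as a breakdown of the price.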

But wait! What does it mean for us? Let's look at it.

  • Clearly, being an entire home or apartment has the biggest impact on the price, followed by the number of bedrooms and bathrooms. These parameters alone can shift the price of your listing.
  • Latitude and longitude, meaning the location, are important as well. We will discuss this later in the post.
  • Availability throughout the year and the host's acceptance rate are important factors. Higher values for both will benefit you.
  • As expected, the higher the minimum number of nights, the less popular the Airbnb will be.
  • It is interesting to note that rentals with higher rents tend to charge little or nothing as a cleaning fee.

What services will actually help you bump the price?

In the case of amenities, it's ‘the more the merrier’: every additional amenity offered will hypothetically be a major plus for the guests who really need it and a non-negative for the guests who don’t. Hence there’s an incentive for a host to list irons, kitchens, and pools.

But there’s a catch. We can see that mentioning basic amenities such as hangers, a lock on the bedroom door, or hairdryers has a negative impact on the listing. This could be because these are very basic necessities and should not be advertised as special features.

  • The easiest ways to increase the price are to include a TV, cable, and an indoor fireplace. Including free on-premises parking adds a bonus as well.
  • Investing in safety devices like a fire extinguisher, safety card, intercom, and smoke detectors will not only help you safeguard the place but also earn a few extra bucks.
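To measure amenity effects like the ones above, the amenities text first has to be turned into one 0/1 column per amenity so a model can use it. Here is a rough sketch, assuming the amenities arrive as a brace-wrapped, comma-separated string such as `{TV,"Cable TV",Kitchen}`. That format, the sample listings, and the function names are assumptions for illustration.

```python
import csv

# Hypothetical listings; prices and amenity strings are made up.
LISTINGS = [
    {"price": 250.0, "amenities": '{TV,"Cable TV","Indoor Fireplace",Kitchen}'},
    {"price": 90.0, "amenities": '{Hangers,"Hair Dryer",Kitchen}'},
    {"price": 180.0, "amenities": '{TV,Kitchen,"Smoke Detector"}'},
    {"price": 70.0, "amenities": "{}"},
]


def parse_amenities(raw):
    """Split a string like '{TV,"Cable TV"}' into the set {'TV', 'Cable TV'}."""
    inner = raw.strip().strip("{}")
    if not inner:
        return set()
    # csv.reader handles the quoted names that contain spaces.
    return set(next(csv.reader([inner])))


def one_hot(listings):
    """Return the amenity column names and one 0/1 row per listing."""
    parsed = [parse_amenities(item["amenities"]) for item in listings]
    columns = sorted(set().union(*parsed))
    rows = [[1 if name in amenities else 0 for name in columns]
            for amenities in parsed]
    return columns, rows


columns, rows = one_hot(LISTINGS)
print(columns)
for listing, row in zip(LISTINGS, rows):
    print(listing["price"], row)
```

Once each amenity is its own column, a regression model can assign it a positive or negative contribution to the price.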

What are the most favored neighborhoods?

We can see that neighborhood does indeed, unsurprisingly, have a strong effect on pricing.

Areas with high average prices

As expected, the areas near the city center cost on average $50 or more above the ones far from the city. It’s the premium you pay to be ‘right there’ location-wise, which honestly sounds good to me.
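A comparison like this boils down to grouping listings by neighborhood and averaging the price. Here is a small sketch with invented prices. The neighborhood names are real Boston areas, but the numbers are placeholders, not results from the dataset.

```python
from collections import defaultdict

# Hypothetical (neighborhood, nightly price) pairs.
SAMPLE = [
    ("Back Bay", 240.0),
    ("Back Bay", 200.0),
    ("Dorchester", 80.0),
    ("Dorchester", 100.0),
    ("South End", 210.0),
]


def average_price_by_neighborhood(listings):
    """Group prices by neighborhood and return averages, highest first."""
    prices = defaultdict(list)
    for neighborhood, price in listings:
        prices[neighborhood].append(price)
    averages = {name: sum(vals) / len(vals) for name, vals in prices.items()}
    return dict(sorted(averages.items(), key=lambda item: item[1], reverse=True))


averages = average_price_by_neighborhood(SAMPLE)
for name, avg in averages.items():
    print(f"{name}: ${avg:.2f}")
```

With only a handful of listings per area an average can be misleading, so it is worth checking the listing count alongside each mean.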

Conclusion

  • Predicting the exact price is difficult but many parameters impact the price and popularity of the place.
  • Airbnb listing price is mostly determined by the type of house, the location, and the number of bedrooms and bathrooms.
  • Amenities play a crucial role and are the best variable an owner can play with to increase the rental value.
  • People are willing to pay more for Airbnbs nearer to the city center.

What to do next?

Performing sentiment analysis on text columns like summary and reviews would make that data useful. It would also be worth testing the impact of more data, like date, weather, and special events, on pricing.

Extending the analysis to other cities like Seattle and New York.

Stay tuned..!!

If you found this analysis interesting or have additional ideas about how to extend it, I’m always happy to chat! My DMs are open on LinkedIn and I can be reached at my website https://vaibhav3m.github.io

In the meantime you can have a look at the project below:
