Airbnb Data

Adam Stoddard

The Airbnb marketplace is very diverse: listings range from a bedbug-ridden couch to a glamorous full-floor penthouse. How do we quantify this dataset? Some of the most interesting factors to look at are text-based. Airbnb includes text descriptions of the apartments and an ‘about’ section for the host.

The apartment descriptions, analyzed with both cosine similarity and topic modeling, are about what you would expect. They consist of the words people use to describe housing: beds, baths, location, access, nearby restaurants, subways, bars, etc. But topic modeling on host descriptions can be enlightening, allowing us to see how people think of themselves. Some hosts group themselves into categories, such as a “professional” who “enjoys” “traveling”, or an “artist” in “Brooklyn” who spends time with “girlfriends.”

Host topic modeling using gensim:

0 place family friends much girlfriends school good entrepreneur ive give
1 really de going also huge always living et vous things
2 living brooklyn ny manhattan people two well park best walk
3 things people time reading make moved meeting year amazing see
4 month great place also architect kyle couple married home ny
5 live favorite years enjoy garden travel home living like life
6 travel professional easy going organized time make clean park slope
7 great work host travel neighborhood see manhattan good currently trip
8 stayed living writer good dogs editor comfortable shows owner magazine
9 great restaurants yorker home space please event traveling way easy

What does the marketplace look like? The following histogram shows the number of bedrooms per listing:

One-bedrooms clearly dominate, with far more units than either studios or two...