The benefit of looking at a problem from a different angle
Incorporating different views or perspectives and combining knowledge from different disciplines can be very beneficial in academic study or in a business setting. It proves to be especially useful when finding ideas or solutions and can bring many new insights. During my internship at AlignAlytics last summer, I encountered several projects that demonstrated this, including a project I contributed to that focused on analysing car breakdown data for an automotive company.
The project specifically aimed to analyse the relationship between breakdowns and weather data. We built a predictive model of call volume that used weather variables in addition to regularly available data such as day of the week, holiday periods and reasons for breakdown, and this proved to be very successful. Individual variables were only weakly predictive of the overall breakdown calls in a specific city, but the combination of several variables in a model gave good results.
When building this kind of model, it is important to think about the variables you are dealing with to try and understand the data. For the call volume model, for example, we explored the use of variables that contained weather data for the previous day. The reasoning for this was that if it snows a lot on one day, this might also affect the call volume on the next day, as perhaps people consequently choose to make their journey on a different day, or cars will have problems starting up the following day.
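The previous-day variables described above can be sketched in Python with pandas. This is a minimal illustration, not the code used in the project: the column names (`snowfall_cm`, `calls`), the helper `add_previous_day_features`, the second city and all values are assumptions made up for the example.

```python
import pandas as pd

# Synthetic daily records (illustrative values only, not project data).
# "Example City" is a hypothetical second city.
daily = pd.DataFrame({
    "city": ["Providence"] * 3 + ["Example City"] * 3,
    "date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"] * 2),
    "snowfall_cm": [0.0, 12.0, 1.0, 0.0, 5.0, 0.0],
    "calls": [40, 55, 70, 90, 100, 110],
})


def add_previous_day_features(df, columns):
    """Return a sorted copy with a `<column>_prev_day` column per variable."""
    out = df.sort_values(["city", "date"]).copy()
    for col in columns:
        # Shift within each city so one city's weather never leaks into another's.
        out[f"{col}_prev_day"] = out.groupby("city")[col].shift(1)
    return out


features = add_previous_day_features(daily, ["snowfall_cm"])
print(features)
```

The first day for each city has no previous-day value and is left missing, so those rows would need to be dropped or filled before training. Shifting by one row only equals "the previous day" if every city has exactly one row per calendar day with no gaps.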
It is also important to think about the level at which the analysis is to be conducted. In this case, analysis and model building were performed at city level, as the number of breakdowns and calls is likely to be closely related to the situation in a city, such as its infrastructure and population size. The model also needs to be trained on an adequate amount of data, such as a whole year, so that the different seasons are included.
Figure 1 compares the two predictive models, random forest and multiple regression, which were built on weather data and recorded call volume for the city of Providence.
Figure 1 – Comparing random forest and multiple regression models for Providence
Taking a weather perspective can be relevant to a great variety of forecasting and modelling projects. In this case the model allows the prediction of breakdown-related calls, which can help with planning call responses in a city and organising call centres. Many other types of business would benefit from modelling and forecasting based on weather conditions, such as those focusing on sales, outdoor activities or the breakdown of machinery. Some interesting research has been conducted on this topic, and it has been found that as exposure to sunlight increases, so does consumer spending (Journal of Retailing and Consumer Services).
Combining different disciplines – or ‘practising interdisciplinarity’ – was also a central theme in my Bachelor’s degree at UCL, fittingly called Arts and Sciences. It emphasised the importance of using knowledge, methods or skills from different disciplines when solving a problem or trying to reach a goal. Having experienced this in an academic setting, it was interesting to see this sort of combination in a company working in Data Science. I got an insight into how AlignAlytics achieves goals by trying out new things, combining people’s knowledge and skills and combining different technologies to optimise processes.
An example of this is their adoption of Google Cloud’s BigQuery, which allows storage and querying of data. The Google Cloud Python Notebook makes it possible to combine SQL queries, which retrieve data from a table, with Python, which is used to analyse that data, for example in data frames. In the breakdown/weather analysis, the notebook made it possible to query the data for each city and then train and test different models in Python (multiple linear regression and random forest) in order to forecast breakdown-related calls. The model output could then be visualised in AlignAlytics’ visualisation software ‘Alytic’.
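The train-and-test step for one city can be sketched as follows, assuming scikit-learn is available. Everything here is an assumption for illustration: the data is synthetic rather than queried from BigQuery, the three features and their coefficients are invented, and holding out every fifth day is just one simple way to get a test set that covers all seasons. It is not the project's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n_days = 365  # one full year, so that every season is represented
day = np.arange(n_days)

# Synthetic features for a single city (assumed, not real data).
max_temp = 12 + 12 * np.sin(2 * np.pi * (day - 100) / 365) + rng.normal(0, 3, n_days)
snowfall = np.where(max_temp < 2, rng.exponential(4, n_days), 0.0)
is_weekend = (day % 7 >= 5).astype(float)
X = np.column_stack([max_temp, snowfall, is_weekend])

# Synthetic daily call volume with an invented dependence on the features.
calls = 80 - 1.5 * max_temp + 3 * snowfall - 10 * is_weekend + rng.normal(0, 5, n_days)

# Hold out every fifth day so the test set spans all seasons.
test_mask = day % 5 == 0
X_train, X_test = X[~test_mask], X[test_mask]
y_train, y_test = calls[~test_mask], calls[test_mask]

models = {
    "multiple linear regression": LinearRegression(),
    "random forest": RandomForestRegressor(n_estimators=200, random_state=0),
}

# Fit each model on the training days and score it on the held-out days.
errors = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    errors[name] = mean_absolute_error(y_test, model.predict(X_test))

for name, mae in errors.items():
    print(f"{name}: mean absolute error = {mae:.1f} calls")
```

Running the same loop once per city keeps the analysis at city level, and comparing both models on the same held-out days gives a like-for-like comparison. For a true forecasting setting, a split that tests only on dates after the training period would be a stricter check.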
Figure 2 shows an Alytic visualisation of the cities used in the analysis. The size of each circle represents the city’s call volume (shown here for 15th of April; specific dates can be selected on the Alytic chart) and the colour of the circle represents the maximum temperature.
Figure 2 – Comparing call volume between the cities
Data Science itself is already a combination of disciplines and skills, as it requires knowledge of data analysis and storage methods/technologies as well as knowledge of the topic or company the analysis focuses on. Even so, many analyses could benefit from changing the way of looking at a problem. So, next time you are trying to solve an issue or performing an analysis, perhaps consider adopting a different perspective and have a look at the weather report.