Fatal motor vehicle accidents –
interactive data visualisation
This interactive data visualisation was developed in the 2nd semester in the course Programmed Design II. The main focus was the examination of the ordering relationship between shape & colour in dynamic models. The semester task was to design an interactive data graphic. The starting material was either general classical calendrical principles or specific summarised data. These were to be made visually tangible in dynamic or interactive form. The means of representation were limited to abstract, geometric forms & the systematic use of colour. Even though the use of any alphanumeric characters was prohibited, the requirement remained that all levels of information contained in the data graphic should be as clearly readable as possible for the viewer.
My interactive data visualisation deals with fatal vehicle accidents in the USA in 2013. The first screen should give an overview of the context the user is in & what the topic is. In a second step the user should get a feeling for how much data is involved & into which data groups it is divided. Only in the third screen should detailed information & connections between the data be visualised. The innermost ring of the diagram shows the number of accidents per month. As soon as a month is selected, a second ring shows how the accidents are distributed over the weekdays within this month. By selecting a weekday, a third ring appears, which shows the number of accidents per hour on that weekday within the selected month.
Design Process –
detailed & extensive
The data used were raw data. Each data set (each line) represented a fatal accident in the USA in 2013. 53 characteristics were recorded for each data set. With regard to the characteristic values, nominal & ordinal values were represented as well as values that can be depicted on an interval or ratio scale.
Since not all characteristics were clearly identified & no explanation could be found for them, the first step of data cleaning was to remove these characteristics. I also removed characteristics whose values, or whose difference from one another, were not clear. Examples are the characteristics “WEATHER1”, “WEATHER2” & “WEATHER” with the values 0-10. Characteristics which made the same statement, such as “Geo Point”, “Geo Shape” & “LATITUDE”, “LONGITUD”, were deleted accordingly. Thus the number of characteristics was reduced from 53 to 22. Afterwards I removed columns that I felt went into too much detail or always had the same value, such as “CITY” or “YEAR”. In the end, 19 characteristics were left, with which I continued the data cleaning process. The next step was to eliminate records with gaps & incorrect data entries (typing errors), which reduced the number of events from 30,057 to 7,062.
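The two cleaning steps described above (dropping columns, then discarding records with gaps) can be sketched roughly as follows. This is only an illustration, assuming the records are available as plain JavaScript objects; the helper names dropColumns and removeIncompleteRows, the column names MONTH and HOUR, and the sample values are hypothetical and not taken from the project.

```javascript
// Hypothetical helpers; the sample rows below are made up for illustration.

// Return copies of the rows without the listed columns.
function dropColumns(rows, columnsToDrop) {
  const drop = new Set(columnsToDrop);
  return rows.map((row) =>
    Object.fromEntries(Object.entries(row).filter(([key]) => !drop.has(key)))
  );
}

// Keep only rows in which every remaining value is present.
function removeIncompleteRows(rows) {
  return rows.filter((row) =>
    Object.values(row).every(
      (value) => value !== null && value !== undefined && value !== ""
    )
  );
}

const sampleRows = [
  { MONTH: 1, HOUR: 14, WEATHER1: 2, YEAR: 2013 },
  { MONTH: 2, HOUR: "", WEATHER1: 0, YEAR: 2013 }, // gap in HOUR
];

// Drop the unclear and constant columns first, then discard rows with gaps.
const cleanedRows = removeIncompleteRows(
  dropColumns(sampleRows, ["WEATHER1", "YEAR"])
);
console.log(cleanedRows);
```

The order of the two steps matters: the fewer columns a record has to fill, the fewer records are discarded because of gaps.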
In order to be able to summarise the adjusted data, I determined dimensions according to which I wanted to summarise, as well as key figures with which I could make calculations. To do this, I wrote down all the characteristics & created combinations that seemed to make sense.
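As a rough sketch of what summarising by a dimension with key figures could look like, the following groups records by one field and computes two key figures per group: the number of accidents and the share involving a drunk driver. The function name summariseBy and the field names month and drunkDrivers are assumptions made for this example, not the names used in the project.

```javascript
// Group records by one dimension and compute two key figures per group.
// Field names (month, drunkDrivers) are assumed for illustration only.
function summariseBy(records, dimension) {
  const groups = new Map();
  for (const record of records) {
    const key = record[dimension];
    const group = groups.get(key) ?? { count: 0, drunk: 0 };
    group.count += 1;
    if (record.drunkDrivers > 0) group.drunk += 1;
    groups.set(key, group);
  }
  // Convert the raw counters into the key figures used for display.
  return new Map(
    [...groups].map(([key, { count, drunk }]) => [
      key,
      { count, drunkShare: drunk / count },
    ])
  );
}

const exampleRecords = [
  { month: 1, drunkDrivers: 0 },
  { month: 1, drunkDrivers: 1 },
  { month: 2, drunkDrivers: 0 },
];
console.log(summariseBy(exampleRecords, "month"));
```

The same function can be reused for any other dimension (weekday, hour, state) by passing a different field name.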
The requirements prohibit any use of text, icons, numbers, operating elements such as buttons or sliders, as well as the use of pictorial forms. As a result, I was able to create designs in terms of colour, (abstract) shape, size, positioning or arrangement, transition or animation & interaction. First of all, I thought about each individual design possibility, to what extent I could use it in a planned & structured way.
collection of ideas
After the basic considerations of the design possibilities I collected first ideas for a data visualisation. Moreover I put together a list of questions to help me with the concept. This list contained questions like: “What perspectives on the data are there?” or “Which aspects can be linked & represented in an understandable way despite the restrictive framework conditions?”
possible data visualisations
Below is a selection of considerations for the data visualisations I made before starting to code.
During the semester, the presentation of the data as well as the focus of the data visualisation changed again and again. While at the beginning a cartographic data visualisation seemed very appealing to me in order to clarify the context of the data, I later decided against such a representation. Aggregated data seemed to me to be more exciting in a thematic context. I also developed many representations of accidents per state. Since the number of accidents per state varies greatly and the proportion of accidents involving drunk drivers also varies, interesting variations emerged. One negative aspect of these representations, however, was that the data cleansing eliminated eight of the 50 states of the USA. Thus, only the remaining 42 states could have been visualised, and the missing states might have confused the viewer. After dropping the geographical features, a representation of the temporal aspects seemed suitable, especially since many features were available per data set.
Based on the focus on the aspect of time, I developed a new concept for the interactive data visualisation. The basic idea was to give all representations a round shape or to arrange all elements radially, as on a classical clock. In this way I wanted to achieve the desired uniformity of shape. The radial arrangement, however, cannot be applied to all features concerning time. It works for cyclical units: January is followed by February and, at the end of the year, December is followed by January again. The rhythm of the days of the week and the times of day also always remains the same. Only for specific days of a month does the display not work, because the 30th of a month is not always followed by the 31st. For this reason I wanted a representation of the accidents per month, from which it was possible to switch to a representation of the accidents per weekday within a month. Starting from this representation, the most detailed view should follow, showing the number of accidents per hour per weekday within one month. Before the user reaches this level of detail, however, an overview of the number of data records used should first be given.
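The clock-like arrangement can be expressed with two small functions: one that gives each slot of a cycle (12 months, 7 weekdays, 24 hours) its starting angle, and one that converts an angle and a radius into screen coordinates. This is a minimal sketch; the function names are hypothetical, and starting at the 12 o'clock position and running clockwise is an assumption based on the clock analogy.

```javascript
// Angle (in radians) at which slot `index` of `slotCount` starts,
// measured clockwise from the 12 o'clock position, as on a clock face.
function slotStartAngle(index, slotCount) {
  return (index / slotCount) * 2 * Math.PI;
}

// Convert a clockwise-from-top angle and a radius to screen coordinates
// (y grows downwards, as in SVG and canvas).
function polarToScreen(cx, cy, radius, angle) {
  return {
    x: cx + radius * Math.sin(angle),
    y: cy - radius * Math.cos(angle),
  };
}

// Example: where the fourth of twelve months (index 3) starts
// on a ring of radius 100 centred at (200, 200).
console.log(polarToScreen(200, 200, 100, slotStartAngle(3, 12)));
```

Because every unit used here repeats with a fixed number of slots, the same two functions serve all three rings; only slotCount and the radius change.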
Apart from the mere representation of the number of accidents in total or per time unit, an additional dimension should be added to show further correlations. This additional dimension should be coded using colour. As already shown in the pre-visualisations, the percentage of drunk drivers varies according to state, month, day of the week and time of day and is therefore perfectly suited as an additional dimension. Each element of the radial diagrams should be assigned a colour according to the percentage of drunk drivers.
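One simple way to code a percentage as colour is linear interpolation between two colours. The sketch below is an assumption about how such a mapping could look, not the colour scale actually used in the project; the function names and the example colours are hypothetical.

```javascript
// Linearly interpolate between two RGB colours given as [r, g, b] arrays.
// t = 0 gives `from`, t = 1 gives `to`; values outside [0, 1] are clamped.
function interpolateColour(from, to, t) {
  const clamped = Math.min(1, Math.max(0, t));
  const channel = (a, b) => Math.round(a + (b - a) * clamped);
  const [r, g, b] = [0, 1, 2].map((i) => channel(from[i], to[i]));
  return `rgb(${r}, ${g}, ${b})`;
}

// Map a drunk-driver share onto the colour range, normalised between the
// smallest and largest share that occur in the data.
function colourForShare(share, minShare, maxShare, from, to) {
  const t =
    maxShare === minShare ? 0 : (share - minShare) / (maxShare - minShare);
  return interpolateColour(from, to, t);
}

// Example with made-up colours: black for the lowest share, orange for the highest.
console.log(colourForShare(0.25, 0, 0.5, [0, 0, 0], [200, 100, 0]));
```

Normalising between the smallest and largest share spreads the available colour range over the values that actually occur, which makes small differences between elements easier to see.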
Up to this point I worked with the data set of 7,062 events. All previous sections & the considerations & decisions contained therein referred to the 7,062 entries in the data set. Since the concept I am now pursuing contains fewer features than I had initially thought, I returned to the original data set. I cleaned it up so that this time I really deleted all features that were not used in my concept. So the list of the original 53 features was reduced to Longitude, Latitude, Month, Weekday, Hour, Drunk Drivers. Since each event now had to have fewer characteristic values, 29,426 entries remained in the data set. From now on I used this newly cleaned up data set.
Due to the set requirements, which prohibited the use of any alphanumeric characters & pictorial forms, it seemed very important to me during the development of the screens to guide the user through them in a targeted manner. Only in this way was it possible to convey the context of the individual screens. Based on this consideration, the first screen should give an overview of the context the user is in & the subject matter. In a second step the user should get a feeling for how much data is involved & into which data groups it is divided. Only in the third screen should detailed information & connections between the data be visualised.
After determining which screen should convey what, I compared this with the available design options & looked at which aspects could be best conveyed by which design options. The following illustration shows my transitions to a possible coding.
Parallel to the development of the individual screens, I thought of a user concept that was as intuitive as possible. Since the three different screens are based on each other, it seemed to me to be reasonable to arrange them linearly & in reading direction from left to right. This resulted in a division of the screen into three parts in order to switch between the visualisations. In order to make this concept obvious to the user, a small rectangle should be inserted at the bottom of each screen to indicate in which of the three representations you are.
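Dividing the screen into three parts amounts to mapping a horizontal position to a zone index. The following is a minimal sketch under the assumption that the zones are equally wide and that the pointer's x position selects the screen; the function name screenIndexForX is hypothetical.

```javascript
// Map a horizontal pointer position to one of `screenCount` equal-width zones,
// ordered left to right. Positions outside the canvas are clamped.
function screenIndexForX(x, canvasWidth, screenCount = 3) {
  const index = Math.floor((x * screenCount) / canvasWidth);
  return Math.min(screenCount - 1, Math.max(0, index));
}

// Example for a 900 px wide canvas: left, middle and right third.
console.log([100, 450, 800].map((x) => screenIndexForX(x, 900)));
```

The same index can drive the small rectangle at the bottom of the screen, for example by placing it at index * canvasWidth / screenCount.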
Already during the concept development I tried to implement individual screens with Snap.svg. Due to the large amount of data that I wanted to display as individual elements, I soon reached the limits of what was possible when animating the individual circles. Nevertheless I wanted to realise the project with Snap.svg, because the radial bar chart could in principle be implemented very well with it & I found it the most exciting part of my data visualisation. My first concept for the radial bar chart was to use paths to draw the individual circle segments as filled areas. I later changed this concept: instead of drawing filled circle segments, I drew only a single path per segment & recreated the original area by assigning the appropriate line width to that path. This gave me better performance & the animations of the individual segments of the diagram became more fluid. It did not take much time for the radial bar chart to have the desired visual impact, whereas it took me a lot of time & effort to correctly program the functions triggered by clicking on a segment, so that the appropriate data was always displayed & animated correctly.
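The idea of replacing a filled circle segment with one stroked path can be sketched as follows: the arc runs along the centre line of the bar, and the stroke width equals the bar length. This is an illustration of the technique rather than the project's actual code; the function names arcPath and barSegment are hypothetical, and angles are assumed to run clockwise from 12 o'clock.

```javascript
// Build the `d` attribute of an SVG path that traces a circular arc
// from startAngle to endAngle (radians, clockwise from 12 o'clock).
function arcPath(cx, cy, radius, startAngle, endAngle) {
  const point = (angle) => [
    (cx + radius * Math.sin(angle)).toFixed(2),
    (cy - radius * Math.cos(angle)).toFixed(2),
  ];
  const [x1, y1] = point(startAngle);
  const [x2, y2] = point(endAngle);
  const largeArc = endAngle - startAngle > Math.PI ? 1 : 0;
  // Sweep flag 1 draws in the positive-angle direction, which is
  // clockwise in SVG's y-down coordinate system.
  return `M ${x1} ${y1} A ${radius} ${radius} 0 ${largeArc} 1 ${x2} ${y2}`;
}

// A bar of length `barLength` starting at `innerRadius` becomes one arc on
// the bar's centre line, with the stroke width set to the bar length.
function barSegment(cx, cy, innerRadius, barLength, startAngle, endAngle) {
  return {
    d: arcPath(cx, cy, innerRadius + barLength / 2, startAngle, endAngle),
    strokeWidth: barLength,
  };
}

console.log(barSegment(100, 100, 50, 20, 0, Math.PI / 2));
```

In Snap.svg the result could then be drawn with something like paper.path(segment.d).attr({ fill: "none", stroke: colour, strokeWidth: segment.strokeWidth }), where paper and colour are placeholders. Animating a bar then only requires changing the stroke width and radius instead of recomputing a filled outline.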
As mentioned at the beginning, Snap.svg was not suitable for animating the just over 29,400 circles. However, since animation made a significant contribution to understanding the data visualisation & I wanted a smooth transition between the individual screens, I had to look for a solution. The solution was to use the library p5.js for the first two screens. Unlike Snap.svg, this library is not based on SVG but on a canvas. Mapping the accidents so that they depicted the USA was quickly realised, as I could fall back on individual code parts from an earlier project & only had to integrate the new data set & make a few adjustments. After I had understood the mathematical background of my planned radial arrangement of the individual circles, I was able to implement this visualisation statically relatively quickly. The only difficulty was the display of the outermost ring, which is usually not completely filled, but I found a solution for this as well. For the animation from the first to the second screen I included Tween.js as an additional library. A big challenge for me was to create the transition from the first two screens to the third screen & back again to one of the first two screens. It was also difficult to get a smooth transition between the animations.
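One possible way to arrange a large number of circles radially is to fill concentric rings from the inside out, giving each ring as many slots as fit on its circumference. The sketch below is an assumption about such a layout, not the project's implementation; in particular, keeping the full-ring angular step in the partly filled outermost ring is just one option, and the function name ringLayout is hypothetical.

```javascript
// Place `count` dots on concentric rings around (0, 0).
// Ring k (k = 1, 2, ...) has radius k * spacing and room for
// floor(2 * PI * k) dots, so neighbouring dots sit roughly `spacing` apart.
function ringLayout(count, spacing) {
  const positions = [];
  let ring = 1;
  while (positions.length < count) {
    const capacity = Math.floor(2 * Math.PI * ring);
    const dotsInRing = Math.min(capacity, count - positions.length);
    // Keep the full-ring step even if this ring ends up only partly filled.
    const step = (2 * Math.PI) / capacity;
    for (let i = 0; i < dotsInRing; i++) {
      const angle = i * step; // clockwise from 12 o'clock
      positions.push({
        ring,
        x: ring * spacing * Math.sin(angle),
        y: -ring * spacing * Math.cos(angle),
      });
    }
    ring += 1;
  }
  return positions;
}

// Example: 20 dots fill ring 1 (6 slots) and ring 2 (12 slots),
// leaving 2 dots for the partly filled ring 3.
const layout = ringLayout(20, 10);
console.log(layout.filter((p) => p.ring === 3).length);
```

The resulting positions can serve as animation targets: each circle moves from its map position on the first screen to its ring position on the second.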
Key Learnings –
programming needs time & patience
In retrospect I can say that the implementation of the concept took a lot of time, because for many screens I first had to understand the mathematical logic behind them before I could start programming. During the programming itself I often did not know that certain commands & implementation options existed. On the one hand, this knowledge was conveyed to me through assistance & suggestions from the lecturers; on the other hand, I acquired it myself through very time-consuming research. The time invested in programming was worth it in my opinion, because I was able to implement almost everything exactly as I had imagined it.
Project information –
Any questions? –
let’s get in touch
If you have any questions about the project or want to know more about it, just write me a short message.
I am happy about any suggestions and comments.