Data Visualization ================================= After all data has been generated, a graph is usually created to help visualize the relationship and draw some conclusions about the dataset. This document will show how to do this using ``seaborn`` package. The use case described here is from the paper *Modelling the Joint Effect of Social Determinants and Peers on Obesity Among Canadian Adults* which is described in :ref:`Customizing Node Values`. The model used in the paper will be referred as obesity model. .. Creating graphs: Creating graphs --------------------------------- Using ``seaborn`` to graph will require an object of class ``pandas.DataFrame``. A csv file that has data collected from running the obesity model with different values of p, 20 replications each, is used here. The model was also run with different types of networks, small world and scale-free, which can be indicated from the **Type** column. The first five rows of the file: .. csv-table:: :file: obesity_serial_sample.csv The complete dataset can be found here: :download:`obesity_serial.csv<../text/obesity_serial.csv>` The file can be loaded to a panda dataframe using ``pandas.pd.read_csv()``: .. code-block:: python import pandas as pd runs_data = pd.read_csv("obesity_serial.csv") The graph can be plotted and shown using both ``seaborn`` and ``matplotlib``. In this case, line plot is used. The complete code is shown below: .. code-block:: python import seaborn as sns import pandas as pd import matplotlib.pyplot as plt runs = pd.read_csv("obesity_serial.csv") sns.set_style("darkgrid") sns.lineplot(data=runs, x="p", y="Obesity", hue="Network Type", err_style="bars") plt.show() The result graph is displayed below: .. image:: ../img/serial_times.svg Comparing runtimes --------------------------------------- Graphs can be used to compare the runtime across multiple categories. In order to explore how different types of networks influence the runtime of the model for the parallel version, data was collected by running the obesity model with different types of networks and increasing number of agents. The data are then organized into a csv file. The first five rows are displayed below: .. csv-table:: :file: parallel_times_sample.csv The complete csv file can be found here: :download:`parallel_times.csv<../text/parallel_times.csv>` In order to use the above file for graphing, a pandas dataframe should be created and then ``seaborn`` and ``matplotlib`` to graph it. Parameter ``hue`` can be set as ``Type`` to group the data. Parameter .. code-block:: python import seaborn as sns import pandas as pd import matplotlib.pyplot as plt runs = pd.read_csv("parallel_times.csv") sns.set_style("darkgrid") sns.lineplot(data=runs, x="Agents", y="Time(minutes)", hue="Type", err_style="bars") plt.show() The resulted graph is displayed below: .. image:: ../img/parallel_times.svg There are different ways to organize the dataset to help with data Visualization. More details can be found on seaborn package page