Mitosheet is a spreadsheet that lives inside your JupyterLab notebook. This lets you move quickly back and forth between doing data analysis in Python and exploring or manipulating your data in the Mitosheet. In this article we discuss how to carry out instant data analysis using Mitosheet.

This article was originally published by Mito.

Because the Mitosheet is optimized for your Python data analysis, every tab inside of your Mitosheet represents a Pandas dataframe. To demonstrate this dataframe aspect, let’s take a look at displaying a single dataframe (of Tesla stock prices) in a Mitosheet.
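As a minimal sketch of that call, assuming the mitosheet package is installed and the code runs in a JupyterLab cell: the inline price values below are made up for illustration and stand in for a real tesla-stock.csv file, and the helper show_in_mito is a hypothetical wrapper, not part of Mito.

```python
import io

import pandas as pd

# Made-up sample rows standing in for a real tesla-stock.csv file.
csv_text = """Date,Open,Close
2020-01-02,84.90,86.05
2020-01-03,88.10,88.60
2020-01-06,88.09,90.31
"""
tesla_stock = pd.read_csv(io.StringIO(csv_text))


def show_in_mito(df):
    """Hypothetical helper: render df as a tab in a Mitosheet."""
    # Imported here so the rest of the example runs without mitosheet.
    import mitosheet

    mitosheet.sheet(df)


# In a JupyterLab cell you would then call:
# show_in_mito(tesla_stock)
print(tesla_stock)
```

The dataframe appears as a single tab in the rendered sheet.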

Mitosheet can also help you easily import your CSV and Excel files into dataframes with just the click of a button. You can see more about importing data into Mito.

Exploring Your Data in the Mitosheet

Once we’ve got a dataframe displayed in the Mitosheet, we can immediately begin to explore our data as if we were in a standard spreadsheet. Let’s highlight a few ways to use the Mitosheet to explore your dataset.

Immediately Visible Data

Unlike printing out a dataframe, Mito immediately lets you see your entire dataset. As you can see from the above screenshot, Mito makes it easy to:

  1. See the column headers of your dataframe.
  2. Scroll through the rows of your dataframe and look for data points of interest.
  3. View the types of each of the columns in your dataframe.
  4. See the number of rows and columns in your dataframe.
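For comparison, here is a rough sketch of how the same four pieces of information could be pulled out with plain pandas calls. The sample values are made up; only the column names Open and Close come from the example in this article.

```python
import pandas as pd

# Made-up sample values; a real dataframe would come from your own data.
tesla_stock = pd.DataFrame({
    "Date": ["2020-01-02", "2020-01-03", "2020-01-06"],
    "Open": [84.90, 88.10, 88.09],
    "Close": [86.05, 88.60, 90.31],
})

column_headers = list(tesla_stock.columns)  # 1. column headers
first_rows = tesla_stock.head(2)            # 2. a slice of rows to look through
column_types = tesla_stock.dtypes           # 3. type of each column
n_rows, n_columns = tesla_stock.shape       # 4. number of rows and columns

print(column_headers)
print(first_rows)
print(column_types)
print(n_rows, n_columns)
```

Each of these needs a separate call in pandas, whereas the Mitosheet shows them together in one view.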

Mito also offers a variety of more advanced ways to explore and understand your data, including column summary statistics, graphs of your data, and more.

Editing Your Data in the Mitosheet

Beyond letting you explore your data in a spreadsheet, the Mitosheet allows you to easily edit your data. You can find most of the basic actions in the toolbar at the top of the Mitosheet. Included actions allow you to edit columns in your dataframe, create pivot tables, merge multiple dataframes together, and much more.

Furthermore, you can use the Action Search bar to search for any functionality that is included in the Mitosheet. If you’re not sure how to find some functionality in the Mitosheet, try searching for it!

Editing Columns in the Sheet

One of the most basic operations you may perform in any spreadsheet is adding a column and writing a spreadsheet formula in it. Let’s use the Mitosheet to calculate the Tesla stock price movement over each trading day.

  1. First, use the add column button to add a column to the dataset.
  2. Then, double click on the column header to edit it directly and rename it to Movement.

Now that we have a new column, we can write a formula inside of it! Simply:

  1. Double click on a cell in the column.
  2. Set the formula, referencing the other column headers by their name (here, Close - Open).

After pressing enter, we can see we’ve easily calculated a new column in our dataframe, just like we would in a spreadsheet.

The Generated Code

For each edit you make to the Mitosheet, Mito generates code below that corresponds to this edit. As you can see in the screenshot below, we’ve added a column and set its formula, and the code that we generated does the very same:
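As a rough pandas sketch of what code for these two edits does, assuming a dataframe named tesla_stock with Open and Close columns and made-up sample values: this is an illustration only, not Mito's literal output, which may differ in form and comments.

```python
import pandas as pd

# Made-up sample values for illustration.
tesla_stock = pd.DataFrame({
    "Open": [84.90, 88.10, 88.09],
    "Close": [86.05, 88.60, 90.31],
})

# Edit 1: add a new column named Movement (placeholder value until a formula is set).
tesla_stock["Movement"] = 0

# Edit 2: set the column's formula to Close - Open.
tesla_stock["Movement"] = tesla_stock["Close"] - tesla_stock["Open"]

print(tesla_stock)
```

Each edit in the sheet maps to one step in the code, so the script stays readable as the analysis grows.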

Running the Generated Code

Now that you’ve generated some code, you can run it! By default, the edits you make in the Mitosheet only apply to a copy of the dataframe, so running the generated code is what locks those changes into your actual dataframes. To use the edited dataframes in the rest of your analysis, just run the generated code, and keep writing Python below as you normally do!
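The copy behavior can be pictured with plain pandas. This sketch assumes nothing about Mito's internals; the variable sheet_copy is hypothetical and only stands in for the copy that the Mitosheet edits, and the sample values are made up.

```python
import pandas as pd

tesla_stock = pd.DataFrame({"Open": [84.90, 88.10], "Close": [86.05, 88.60]})

# Hypothetical stand-in for the copy the Mitosheet works on.
sheet_copy = tesla_stock.copy()
sheet_copy["Movement"] = sheet_copy["Close"] - sheet_copy["Open"]

# The original dataframe is untouched by edits made to the copy.
changed_before_running = "Movement" in tesla_stock.columns
print(changed_before_running)  # False

# Running the generated code applies the same edit to the original.
tesla_stock["Movement"] = tesla_stock["Close"] - tesla_stock["Open"]
changed_after_running = "Movement" in tesla_stock.columns
print(changed_after_running)  # True
```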

Generated Code and Replaying an Analysis

If you make a call to mitosheet.sheet() to render a Mitosheet, and the cell below this call contains code that was generated by Mito, then the new Mitosheet will attempt to replay the edits you made onto the new dataframes that are passed to the mitosheet.sheet() call. To illustrate how this works, consider the example above where we added a column to the Mitosheet and set its formula. If we were to then:

  1. Rerun the cell importing the tesla-stock.csv file as a dataframe.
  2. Rerun the cell rendering the Mitosheet with mitosheet.sheet(tesla_stock).

The Mitosheet would add the Movement column to the dataframe and set its formula to Close - Open, as in the previous analysis. This is because the generated code cell is right below the mitosheet.sheet() call, and so Mito attempts to replay this analysis on the new dataframe that is passed (although the dataframe in this case happens to be identical).

Thus, if you change the arguments passed to mitosheet.sheet(), you need to make sure that the columns you edited in your analysis still exist! If we passed a tesla_stock dataframe to the mitosheet.sheet() call that did not contain the Open or Close columns, then an error would occur, because the formula that the Mitosheet attempts to reapply is no longer valid (the columns it references no longer exist).
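One way to guard against this is to check for the referenced columns before rendering the sheet. The sketch below is a plain pandas check, not a Mito feature; the helper missing_columns and the constant REQUIRED_COLUMNS are hypothetical names introduced for illustration.

```python
import pandas as pd

# Columns referenced by the Movement formula (Close - Open).
REQUIRED_COLUMNS = ["Open", "Close"]


def missing_columns(df, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from df."""
    return [name for name in required if name not in df.columns]


# Made-up dataframe that lacks the Close column.
tesla_stock = pd.DataFrame({"Date": ["2020-01-02"], "Open": [84.90]})

missing = missing_columns(tesla_stock)
if missing:
    print(f"Cannot replay the analysis; missing columns: {missing}")
else:
    print("All referenced columns are present; safe to call mitosheet.sheet(tesla_stock).")
```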

