Get help from the marimo community

The notebook loads a couple of Pandas dataframes, each with 5-10M rows, filters each of them down to 3-5M rows, samples 10% of them, and plots various charts. I'm unsure whether to allocate more resources, make my notebook more resource-efficient, or try something else. I can provide the stacktrace if helpful.
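A minimal sketch of the pipeline described above (the column name, filter condition, and frame size are stand-ins, and the toy frame is tiny so it runs anywhere). Selecting only the columns you need at load time and sampling before plotting are the usual first steps to cut memory use:

```python
import numpy as np
import pandas as pd

# Stand-in for the real 5-10M-row load; in practice, pass usecols= to
# pd.read_csv / columns= to pd.read_parquet to avoid loading unused columns.
rng = np.random.default_rng(0)
df = pd.DataFrame({"value": rng.normal(size=10_000)})

# Filter down (real case: 3-5M rows), then sample 10% for plotting.
filtered = df[df["value"] > 0]
sampled = filtered.sample(frac=0.10, random_state=0)
```

Plotting `sampled` instead of `filtered` keeps only ~10% of the rows alive in the chart layer, which is often the difference between fitting in memory and not.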
2 comments
Hi team, thanks for your work on this package. Apologies for the dumb question: I work with large datasets and have been using altair for dynamic plotting with slider filters.
In Jupyter:

Python
alt.data_transformers.enable("vegafusion")
alt.renderers.enable("jupyter", offline=True)

This is very fast (less than a second per update).
I tried to achieve the same filtering using marimo sliders directly on my polars dataframe and then plotting. This works, but it is slower (2-5 seconds), since polars has to redo the filtering and two cells have to rerun.
For this use case the JupyterChart (jupyter renderer) is faster. Is there any plan to support it in marimo? (The other renderers are too slow for large datasets.)
20 comments
When sharing a presentation or app, it is unclear if the app is still loading, or how much of it is loaded.

This is a problem if you are sharing the app with someone who is unfamiliar with the platform. If it's taking a while, content might appear to simply not exist or be missing.

There's also a bit of a delay on initial page load of the app and presentation. After the "loading dependencies" spinner, there's a gap before the hourglass spinner appears in the top left.

With presentation mode, it is even more unclear. Slide indicators (circles) appear at the bottom only for cells that have executed and produced content, so someone might flip through all the slides without realizing that some of the content isn't there yet.

Ideally, both the app and the presentation would show "Still loading content, please wait to see all components" along with a progress bar.
3 comments
We run an internal reporting application that I would like to speed up significantly. The flow of these reports is pretty simple:

  1. Pull Data
  2. Run a bunch of calculations
  3. Build visualizations
  4. Display
What I think marimo is probably capable of (though it isn't super clear to me yet) is caching the report once it has been calculated: if the same report is requested again, just pull in the HTML that was previously cached; if it is the first time, build the full report and cache that HTML.

Are there any examples around this? What feels weird to me is a notebook programmatically caching itself as HTML and potentially displaying a cached version of itself. But it seems like the tools are probably already in place to do this?
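The flow you describe doesn't need anything marimo-specific; a sketch of the key-by-parameters-then-serve-cached-HTML idea in plain Python (the function names, cache directory, and the trivial `build` callback are all hypothetical, not a marimo API):

```python
import hashlib
from pathlib import Path

# Hypothetical on-disk cache keyed by the report's parameters.
CACHE_DIR = Path("report_cache")
CACHE_DIR.mkdir(exist_ok=True)

def cached_report(params: dict, build) -> str:
    """Return cached HTML for `params`, building and caching on a miss."""
    key = hashlib.sha256(repr(sorted(params.items())).encode()).hexdigest()
    path = CACHE_DIR / f"{key}.html"
    if path.exists():
        return path.read_text()  # cache hit: skip pull/calculate/visualize
    html = build(params)         # first run: steps 1-3 above
    path.write_text(html)
    return html

# Toy build step standing in for the real pull -> calculate -> visualize flow.
html = cached_report({"report": "weekly"}, lambda p: f"<h1>{p['report']}</h1>")
```

If I recall correctly, marimo also ships its own caching helpers (worth checking the docs for `mo.persistent_cache`), which memoize expensive cells across sessions and may cover steps 1-2 without hand-rolled HTML caching.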

See the attached code. If I assign a large numpy array to a cell-scoped variable, then after the cell is done running the memory is still occupied (about 7.5 GB). If I rename the cell-scoped variable and rerun, the old memory remains occupied and a new, equally large chunk is allocated (so the notebook now takes about 15 GB).
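A small-scale stand-in for the situation above (the array here is ~8 MB, not 7.5 GB, so it runs anywhere). One workaround while the old name is still reachable from the notebook's globals is to drop it explicitly at the end of the cell:

```python
import gc

import numpy as np

# Stand-in for the ~7.5 GB array; real case would be ~1e9 float64 elements.
arr = np.zeros((1_000, 1_000))
nbytes = arr.nbytes

# Explicitly dropping the name lets NumPy release the buffer immediately,
# even if the cell is later rerun under a renamed variable; otherwise the
# old object stays reachable until the runtime invalidates/collects it.
del arr
gc.collect()
```

Whether marimo collects the old binding automatically on rerun is the question here; `del` plus `gc.collect()` is the manual escape hatch either way.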
4 comments
Hi, I'd like to try out marimo to compose music, but I can't play midi files. For instance,

Python
mo.audio(src="https://bitmidi.com/uploads/79828.mid")


... will only display a music player with 0:00 / 0:00.
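This is most likely not marimo-specific: browsers cannot decode `.mid` files natively, since MIDI is a stream of note events rather than sampled audio, so the `<audio>` element behind `mo.audio` has nothing to play. The usual workaround is to render the MIDI to WAV or MP3 first (e.g. with a synthesizer such as FluidSynth) and feed that to `mo.audio`. As a self-contained stand-in for the rendering step, this synthesizes one second of A440 as an in-memory WAV:

```python
import io
import math
import struct
import wave

# Stand-in for "MIDI rendered to sampled audio": 1 second of a 440 Hz sine.
rate = 44_100
samples = (
    int(32_767 * math.sin(2 * math.pi * 440 * t / rate)) for t in range(rate)
)

buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)   # mono
    w.setsampwidth(2)   # 16-bit PCM
    w.setframerate(rate)
    w.writeframes(b"".join(struct.pack("<h", s) for s in samples))
wav_bytes = buf.getvalue()

# Write wav_bytes to a file (or serve it) and point mo.audio at that,
# e.g. mo.audio(src="rendered.wav") -- a format browsers can decode.
```

The filenames and the FluidSynth suggestion are assumptions about your setup; the point is just that `mo.audio` needs a browser-playable format.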
12 comments
Continuing the conversation with @evandertoorn

Memory Management


How does marimo keep things in memory? marimo internally manages a "globals" dict shared between all cells; everything that is defined is put into this dictionary. To determine the order in which to run cells, the DAG relies primarily on static code analysis, without regard to what has already been defined at runtime. Since the globals dict persists for the whole session, it could in principle lead to memory build-up; in practice, though, marimo removes variables and lets them be garbage-collected when a cell is invalidated.
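The static-analysis step can be sketched with the standard-library `ast` module. This is a deliberately simplified toy (the real implementation handles imports, augmented assignment, function scopes, and much more): for each cell, collect which globals it defines and which it reads, and make a cell depend on every cell that defines a name it reads.

```python
import ast

# Two toy "cells": b reads the name x, which a defines.
cells = {
    "a": "x = 1",
    "b": "y = x + 1",
}

def defs_and_refs(code: str):
    """Return (names defined, names read) by a cell's source."""
    defined, read = set(), set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                read.add(node.id)
    return defined, read - defined

info = {name: defs_and_refs(code) for name, code in cells.items()}
graph = {
    name: {other for other, (d, _) in info.items() if other != name and d & reads}
    for name, (_, reads) in info.items()
}
# graph["b"] contains "a": cell b must run after cell a.
```

No code is executed to build this graph, which is why the DAG works "without regard to what has already been defined": the dependency edges come purely from the source text.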
9 comments