Andrew Heiss :rstats: on Nostr: So I've seen #rstats people rave about DuckDB and Parquet but haven't really had a ...
So I've seen #rstats people rave about DuckDB and Parquet but haven't really had a chance to see what it does—until today! I came across a 2.5 GB Stata file that {haven} couldn't open, so I used Python to convert it to Parquet, then read it into R! File is now only 200 MB and R memory usage is tiny!
Published at
2024-04-26 20:06:13
Event JSON
{
"id": "a54b12480c97102a20734735f3bcc91b1f3dae7dc1879c500669e459fc0a3730",
"pubkey": "9126464ea6755c429eb0db5855b256863513dba63cfaa5da61ad6455e1857d02",
"created_at": 1714161973,
"kind": 1,
"tags": [
[
"t",
"rstats"
],
[
"proxy",
"https://fediscience.org/users/andrew/statuses/112339319080184378",
"activitypub"
]
],
"content": "So I've seen #rstats people rave about DuckDB and Parquet but haven't really had a chance to see what it does—until today! I came across a 2.5 GB Stata file that {haven} couldn't open, so I used Python to convert it to Parquet, then read it into R! File is now only 200 MB and R memory usage is tiny!\n\nhttps://fediscience.org/system/media_attachments/files/112/339/315/492/040/646/original/50d40c48404f3ea9.png\n\nhttps://fediscience.org/system/media_attachments/files/112/339/315/700/212/339/original/48489ded73d9a33e.png",
"sig": "ac6699c69286d3229090599298e5a1f8ee7e73f51437a57aca1a00c89970a02aeb6e1ce4bd25fd7ddc464e044398120b1a179f5c466bee7a4bfbc1b2b935734a"
}