Comparison of common data file formats in Mandarin
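=== 5. SQLite or Parquet files ===
When a dataset exceeds Excel's row limit, consider one of these two formats.

SQLite has a theoretical limit of 2⁶⁴ rows per table (about 1.8 × 10¹⁹), but that figure cannot be reached in practice because the maximum database file size of about 281 TB is hit first. At that maximum size, a database can hold roughly 2 × 10¹³ rows, and only if it has no indexes and every row is very small.<ref>SQLite official documentation, Implementation Limits For SQLite: https://sqlite.org/limits.html</ref>

Parquet has no hard row limit. The format stores data in units called row groups. A row group is a horizontal partition inside a Parquet file: each one holds a contiguous run of rows, and within it the values are stored column by column. For example, a row group holding 1 million rows of a 10-column dataset is stored as 10 separate column chunks rather than row by row. A query can therefore read only the columns it needs instead of scanning whole rows, which substantially speeds up reads. The default cap is 1 million rows per row group (this is a writer default, as in arrow-rs, not a limit of the format), and a file can contain any number of row groups, so the total row count of a file has no theoretical limit.<ref>Apache arrow-rs GitHub issue #5797, "Row groups are limited to 1M rows by default": https://github.com/apache/arrow-rs/issues/5797</ref> In practice, people have successfully written 500 million to 1 billion rows<ref>Andy Cutler, 10 Billion Rows: Parquet File Size and Distribution When using CETAS: https://www.serverlesssql.com/row-size-and-parquet-file-distribution/</ref>; the bottleneck is usually disk space rather than the format itself.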
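
The figures above can be checked with a little arithmetic. The following is a minimal Python sketch, not an implementation of either format: the page count and page size come from the SQLite limits page, "1M rows" is assumed to mean 1024 × 1024, and all function names are hypothetical. The <code>split_into_row_groups</code> helper is only a toy model of how a row group regroups rows into per-column chunks; real Parquet writers also encode and compress each chunk.

```python
# Back-of-the-envelope checks for the SQLite and Parquet figures quoted above.
# All helper names are hypothetical and exist only for illustration.

SQLITE_MAX_PAGE_COUNT = 4_294_967_294  # default maximum page count (sqlite.org/limits.html)
SQLITE_MAX_PAGE_SIZE = 65_536          # maximum page size in bytes
SQLITE_ROWS_AT_MAX_SIZE = 2 * 10**13   # row estimate quoted in the text
ROW_GROUP_MAX_ROWS = 1024 * 1024       # ASSUMPTION: "1M rows" read as 1024 * 1024


def sqlite_max_db_bytes():
    """Largest database file: maximum page count times maximum page size."""
    return SQLITE_MAX_PAGE_COUNT * SQLITE_MAX_PAGE_SIZE


def bytes_per_row_at_limit():
    """Average bytes available per row when 2e13 rows fill the largest file."""
    return sqlite_max_db_bytes() / SQLITE_ROWS_AT_MAX_SIZE


def row_group_count(total_rows, max_rows=ROW_GROUP_MAX_ROWS):
    """Number of row groups needed for total_rows (ceiling division)."""
    return -(-total_rows // max_rows)


def split_into_row_groups(rows, column_names, max_rows):
    """Toy model: cut rows into groups, then store each group column by column."""
    groups = []
    for start in range(0, len(rows), max_rows):
        chunk = rows[start:start + max_rows]
        # One list per column: this is the "column chunk" inside a row group.
        groups.append(
            {name: [row[i] for row in chunk] for i, name in enumerate(column_names)}
        )
    return groups


if __name__ == "__main__":
    print("SQLite max file size (bytes):", sqlite_max_db_bytes())
    print("Bytes per row at 2e13 rows:", round(bytes_per_row_at_limit(), 2))
    print("2**64 rows:", 2**64)
    print("Row groups for 1 billion rows:", row_group_count(1_000_000_000))

    sample = [(1, "a"), (2, "b"), (3, "c")]
    for group in split_into_row_groups(sample, ["id", "name"], max_rows=2):
        print(group)
```

The bytes-per-row figure comes out to roughly 14 bytes, which is why the 2 × 10¹³ estimate only holds for very small rows without indexes. The row group count shows that even a billion-row file needs only on the order of a thousand row groups at the assumed default size.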