Convert HTML Table To CSV/Excel
Revision as of 15:12, 22 June 2016
Approach 1
Pros: Keeps the original formatting, such as links and colors.
Cons: The content must be copied and pasted manually.
Steps:
- Copy the HTML table manually
- Paste it into Microsoft Excel or LibreOffice Calc
- Save the file as CSV/Excel
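When the pasted table is saved as CSV, cells that contain commas, double quotes, or line breaks have to be quoted. The following is a minimal sketch of that quoting rule in the RFC 4180 style, not the exact routine Excel or LibreOffice Calc uses. The names csvCell and rowsToCsv and the sample data are hypothetical.

```javascript
// Hypothetical helper: quote a single cell for CSV output.
function csvCell(value) {
  const text = String(value);
  // Cells containing a comma, a double quote, or a line break are wrapped
  // in double quotes, and embedded double quotes are doubled.
  if (/[",\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// Hypothetical helper: turn a 2D array of cell values into CSV text.
function rowsToCsv(rows) {
  return rows.map((row) => row.map(csvCell).join(",")).join("\r\n");
}

// Sample data standing in for a copied table (made up for illustration).
const sample = [
  ["Name", "Note"],
  ["Widget, large", 'He said "ok"'],
];
console.log(rowsToCsv(sample));
```

Opening the saved CSV in a text editor is a quick way to check whether cells with commas were quoted like this.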
Approach 2
Pros: The table content is imported automatically.
Cons: Text only. The original formatting, such as links and colors, is lost.
Steps:
- Go to Google Drive
- Create a new spreadsheet
- Use the IMPORTHTML function. Enter the following in a cell: =IMPORTHTML("URL of HTML Table","table",1) The last argument, 1, selects the first table on the web page.
- Save the file as CSV/Excel
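If several pages or tables need importing, the formula text can be generated rather than typed by hand. The sketch below only builds the string to paste into a cell. The function name importHtmlFormula and the example URL are assumptions made for illustration.

```javascript
// Hypothetical helper: build the IMPORTHTML formula text for a spreadsheet cell.
// tableIndex is 1-based: 1 means the first table on the page.
function importHtmlFormula(url, tableIndex = 1) {
  if (!Number.isInteger(tableIndex) || tableIndex < 1) {
    throw new RangeError("tableIndex must be a positive integer");
  }
  // Inside a spreadsheet string literal, a double quote is escaped by doubling it.
  const escapedUrl = String(url).replace(/"/g, '""');
  return `=IMPORTHTML("${escapedUrl}","table",${tableIndex})`;
}

// Example with a placeholder URL; paste the printed text into a cell.
console.log(importHtmlFormula("https://example.com/page.html", 2));
```

Changing the index is the usual fix when the formula imports a layout table instead of the data table.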
Checking the number of rows after import
- Inspect the table element in Chrome
- (Optional) Add an id to the table if it has none, e.g. <table ... ... id="tableId">
- If the table contains a <tbody> tag, open the Chrome DevTools console and enter the following code:
var tableId = "tableId";
var rows = document.getElementById(tableId).getElementsByTagName("tbody")[0].getElementsByTagName("tr").length;
console.log("Number of rows, not counting the heading row: " + rows);
- Press Enter to get the row count.
- If the table does not contain a <tbody> tag, enter the following code in the console:
var tableId = "tableId";
var rows = document.getElementById(tableId).getElementsByTagName("tr").length;
console.log("Number of rows, counting the heading row: " + rows);
- Press Enter to get the row count.
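The two console snippets above can be combined into one helper that reports header, body, and footer rows separately. This is a sketch under stated assumptions: it relies on the standard rows, tHead, and tFoot properties of a table element, and the name countTableRows is hypothetical. Unlike getElementsByTagName("tr"), the rows property does not include rows of tables nested inside cells.

```javascript
// Hypothetical helper: count the rows of a table element.
// Uses the standard HTMLTableElement properties rows, tHead and tFoot.
function countTableRows(table) {
  const total = table.rows.length; // every row that belongs to this table
  const header = table.tHead ? table.tHead.rows.length : 0;
  const footer = table.tFoot ? table.tFoot.rows.length : 0;
  return { total, header, footer, body: total - header - footer };
}

// In the Chrome DevTools console (skipped where no document exists):
if (typeof document !== "undefined") {
  const table = document.getElementById("tableId"); // id added in the steps above
  if (table) {
    console.log(countTableRows(table));
  }
}
```

The body count is the number to compare against the data rows in the imported spreadsheet, while total also counts heading and footer rows.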
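Checking the number of table rows with Chrome
- Select the table's markup, right-click, and choose Edit as HTML
- Manually add an id to the table, e.g. <table ... ... id="tableId">
- Click elsewhere in the page source to save the change automatically
- If the table contains a <tbody> tag, enter the following in the Chrome DevTools console:
var tableId = "tableId";
var rows = document.getElementById(tableId).getElementsByTagName("tbody")[0].getElementsByTagName("tr").length;
console.log("Number of data rows, not counting the heading row: " + rows);
- Press Enter to get the row count
- If the table does not contain a <tbody> tag, enter the following in the Chrome DevTools console:
var tableId = "tableId";
var rows = document.getElementById(tableId).getElementsByTagName("tr").length;
console.log("Number of data rows, counting the heading row: " + rows);
- Press Enter to get the row count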