Convert HTML Table To CSV/Excel

Revision as of 11:06, 28 June 2016


Approach 1

Pros: Keeps the original text formatting, such as links and colors.

Cons: The content has to be copied and pasted manually.

Steps:

  1. Copy the HTML table manually
  2. Paste it into Microsoft Excel or LibreOffice Calc
  3. Save the file as CSV/Excel


Approach 2

Pros: The table content is imported automatically.

Cons: Text only. The original formatting, such as links and colors, is lost.

Steps:

  1. Go to Google Drive
  2. Create a new spreadsheet
  3. Use the IMPORTHTML function: enter =IMPORTHTML("URL of HTML Table","table",1) in a cell. The final argument, 1, selects the first table on the web page.
  4. Save the file as CSV/Excel
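
To make the three IMPORTHTML arguments in step 3 explicit, here is a minimal sketch that assembles the formula text for a given page and table position. The helper name importHtmlFormula and the example URL are hypothetical, introduced only for illustration; doubling any double quote in the URL is the usual way to escape it inside a spreadsheet string literal.

```javascript
// Hypothetical helper (not part of the original page): builds the text
// to paste into a spreadsheet cell.
//   url   - address of the page that holds the table
//   index - 1-based position of the table on that page
function importHtmlFormula(url, index) {
  if (!Number.isInteger(index) || index < 1) {
    throw new Error("index must be an integer starting at 1");
  }
  // Double any quote characters so the URL stays a valid string literal.
  var safeUrl = String(url).replace(/"/g, '""');
  return '=IMPORTHTML("' + safeUrl + '","table",' + index + ')';
}

// Example with a placeholder URL:
console.log(importHtmlFormula("https://example.com/page.html", 1));
// prints: =IMPORTHTML("https://example.com/page.html","table",1)
```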

Checking the number of data rows after import

  • Inspect the table element in Chrome.
  • (Optional) Add an id to the table if it has none, e.g. <table ... ... id="tableId">
  • Check the number of rows. Open Chrome DevTools and enter whichever snippet below matches the table in the console:
 // If the table contains a <tbody> tag
 var tableId = "tableId";
 var rows = document.getElementById(tableId).getElementsByTagName("tbody")[0].getElementsByTagName("tr").length;
 console.log("row count, excluding the heading: " + rows);

 // If the table does NOT contain a <tbody> tag
 var tableId = "tableId";
 var rows = document.getElementById(tableId).getElementsByTagName("tr").length;
 console.log("row count, including the heading: " + rows);
  • Press the Enter key to get the row count.
  • Check the number of columns: [1]
 var tableId = "tableId";
 var column_count = document.getElementById(tableId).rows[0].cells.length;
 console.log("column count of table: " + column_count);
  • Press the Enter key to get the column count.
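
The row and column checks above can also be folded into one helper so the choice between the two snippets does not have to be made by hand. This is only a sketch: the function name tableSize is hypothetical, and it relies on the standard rows, tBodies, and cells collections of a table element rather than on getElementsByTagName.

```javascript
// Hypothetical helper (not from the original page): summarize a table's size.
// `table` should look like an HTMLTableElement, i.e. expose `rows`,
// `tBodies`, and a `cells` collection on each row.
function tableSize(table) {
  var totalRows = table.rows.length; // every <tr>, heading rows included
  var bodyRows = 0;
  for (var i = 0; i < table.tBodies.length; i++) {
    bodyRows += table.tBodies[i].rows.length; // rows inside each <tbody>
  }
  // Like the snippet above, the column count is taken from the first row.
  var columns = totalRows > 0 ? table.rows[0].cells.length : 0;
  return { totalRows: totalRows, bodyRows: bodyRows, columns: columns };
}

// In the Chrome console, assuming the table was given id="tableId":
if (typeof document !== "undefined") {
  var tableElement = document.getElementById("tableId");
  if (tableElement) {
    console.log(tableSize(tableElement));
  }
}
```

If the heading row sits in a <thead>, bodyRows excludes it and totalRows includes it, which mirrors the two counts printed by the separate snippets.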


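Checking the number of table rows with Chrome

  • Select the table's markup, right-click, and choose Edit as HTML
  • Manually add an id to the table, e.g. <table ... ... id="tableId">
  • Click anywhere else in the page source; the change is saved automatically
  • Check the number of data rows
 // If the table contains a <tbody> tag, enter this in the Chrome DevTools console
 var tableId = "tableId";
 var rows = document.getElementById(tableId).getElementsByTagName("tbody")[0].getElementsByTagName("tr").length;
 console.log("row count, excluding the header row: " + rows);

 // If the table does not contain a <tbody> tag, enter this in the Chrome DevTools console
 var tableId = "tableId";
 var rows = document.getElementById(tableId).getElementsByTagName("tr").length;
 console.log("row count, including the header row: " + rows);
  • Press Enter to get the number of data rows
  • Check the number of columns
 var tableId = "tableId";
 var column_count = document.getElementById(tableId).rows[0].cells.length;
 console.log("column count: " + column_count);
  • Press Enter to get the number of columns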
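
Once the table has been saved as CSV, the row count from the console can be compared with the number of records in the exported file. The sketch below is one possible way to count them, not a full CSV parser; the function name countCsvRecords is hypothetical. It assumes fields are quoted with double quotes, that a line break inside a quoted field does not start a new record, and that blank lines are skipped.

```javascript
// Hypothetical helper (not from the original page): count records in CSV text.
function countCsvRecords(csvText) {
  var records = 0;
  var inQuotes = false;   // true while inside a "quoted" field
  var hasContent = false; // true once the current line holds any character
  for (var i = 0; i < csvText.length; i++) {
    var ch = csvText[i];
    if (ch === '"') {
      // An escaped quote ("") toggles twice, so the state is unchanged.
      inQuotes = !inQuotes;
      hasContent = true;
    } else if (ch === "\n" && !inQuotes) {
      if (hasContent) {
        records++;
      }
      hasContent = false;
    } else if (ch !== "\r") {
      hasContent = true;
    }
  }
  if (hasContent) {
    records++; // last record without a trailing line break
  }
  return records;
}

// A heading plus two data rows, one of which has a line break inside quotes:
console.log(countCsvRecords('name,note\nA,plain\nB,"two\nlines"\n'));
// prints: 3
```

The result includes the heading row, so it corresponds to the count printed by the snippet for tables without a <tbody> tag.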

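References

  1. html - Javascript: Count number of columns in a table row - Stack Overflow, http://stackoverflow.com/questions/10043760/javascript-count-number-of-columns-in-a-table-row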