Split text by symbol: Difference between revisions
Jump to navigation
Jump to search
m (→Approachs) |
|||
| (20 intermediate revisions by the same user not shown) | |||
| Line 1: | Line 1: | ||
Split text by symbol (another string, also called delimiter or separator) | Split text by symbol (another string, also called delimiter or separator) | ||
== Example data == | |||
Task: Split the position e.g. 25.040215, 121.512532 by the comma ( , ) to obtain the latitude (1st string) and longitude (2nd string). | |||
== Example == | |||
Task: Split 25.040215, 121.512532 by the comma ( , ) | |||
Expected output: | Expected output: | ||
| Line 14: | Line 8: | ||
* 121.512532 | * 121.512532 | ||
=== MySQL | == Approaches == | ||
list of equivalent approaches to split text by symbol | |||
* MySQL: [http://www.w3resource.com/mysql/string-functions/mysql-substring_index-function.php MySQL SUBSTRING_INDEX() function - w3resource] | |||
* Excel: (1) [https://support.microsoft.com/en-us/kb/214261 parse data] ([https://support.microsoft.com/zh-tw/kb/214261 資料剖析]) {{exclaim}} only allowed one character used as separator! (2) LEFT, RIGHT, FIND, LEN (3) MOD, FIND, LEN | |||
* PHP: [http://php.net/explode explode] or [http://php.net/manual/en/function.preg-split.php preg_split] function | |||
=== Split text by symbol in MySQL === | |||
{{exclaim}} Only one symbol was allowed. The following query will get the last element if there are more than two symbols. | |||
# to get the first string ( 25.040215 ): {{code | code=SELECT TRIM(SUBSTRING_INDEX('25.040215, 121.512532' ,',', <span style="color:red">1</span>))}} | # to get the first string ( 25.040215 ): {{code | code=SELECT TRIM(SUBSTRING_INDEX('25.040215, 121.512532' ,',', <span style="color:red">1</span>))}} | ||
# to get the second string ( 121.512532 ): {{code | code=SELECT TRIM(SUBSTRING_INDEX('25.040215, 121.512532' ,',', <span style="color:red">-1</span>))}} | # to get the second string ( 121.512532 ): {{code | code=SELECT TRIM(SUBSTRING_INDEX('25.040215, 121.512532' ,',', <span style="color:red">-1</span>))}} | ||
=== Excel Approach === | [http://sqlfiddle.com/#!9/d90dc/2/0 Live demo on SQL Fiddle] | ||
<pre> | |||
CREATE TABLE tbl(str varchar(50)); | |||
INSERT INTO tbl VALUES('25.040215, 121.512532'); | |||
INSERT INTO tbl VALUES('25.040215, 121.512532, 999.999'); /*bad data: 2 or more symbols*/ | |||
INSERT INTO tbl VALUES('123.456'); /*bad data: no symbol exists*/ | |||
SELECT TRIM(SUBSTRING_INDEX(str ,',', 1)), TRIM(SUBSTRING_INDEX(str ,',', -1)) FROM tbl WHERE str LIKE '%,%'; | |||
</pre> | |||
=== Split text by symbol in Excel === | |||
Comparison of approaches | |||
<table border="1" class="wikitable sortable"> | |||
<tr> | |||
<th>Approach</th> | |||
<th>Output</th> | |||
</tr> | |||
<tr> | |||
<td>Approach 1: Parse data</td> | |||
<td>'''Multiple''' parts will be returned if the content contains two or more separator characters.</td> | |||
</tr> | |||
<tr> | |||
<td>Approach 2: SUBSTRING_INDEX</td> | |||
<td>Only '''two''' parts will be returned if the content contains two or more separator characters.</td> | |||
</tr> | |||
</table> | |||
==== Approach 1: Parse data ==== | |||
[https://support.microsoft.com/en-us/kb/214261 parse data] ([https://support.microsoft.com/zh-tw/kb/214261 資料剖析]) | |||
* instruction with screenshots: [https://support.office.com/en-za/Article/split-text-into-different-cells-30b14928-5550-41f5-97ca-7a3e9c363ed7 Split text into different cells - Excel] | |||
==== Approach 2: SUBSTRING_INDEX ==== | |||
<table border="1" style=""> | <table border="1" style=""> | ||
<tr style="background-color: #555555; color: #ffffff;"> | <tr style="background-color: #555555; color: #ffffff;"> | ||
| Line 33: | Line 65: | ||
</tr> | </tr> | ||
</table> | </table> | ||
If the data is perfect | If the data set is perfect | ||
# to get the first string ( 25.040215 ): B1 = {{code | code=<span style="background-color: #F6CEF5">MID( A1, 1, FIND(" | # separator character: , | ||
# to get the first string ( 25.040215 ): B1 = {{code | code=<span style="background-color: #F6CEF5">MID( A1, 1, FIND(",", A1)-1)</span> }} | |||
# to get the second string ( 121.512532 ): C1 = {{code | code=<span style="background-color: #ccc">MID( A1, FIND(",", A1)+1, LEN(A1))</span> }} | # to get the second string ( 121.512532 ): C1 = {{code | code=<span style="background-color: #ccc">MID( A1, FIND(",", A1)+1, LEN(A1))</span> }} | ||
if the data was mixed with empty values. Return empty if the value is empty. | if the data set was mixed with empty values. Return empty if the value is empty. | ||
# to get the first string ( 25.040215 ): B1 = {{code | code=IF(ISERROR(<span style="background-color: #F6CEF5">MID(A1, 1, FIND(",", A1)-1)</span>), "", <span style="background-color: #F6CEF5">MID(A1, 1, FIND(",", A1)-1)</span>) }} | # to get the first string ( 25.040215 ): B1 = {{code | code=IF(ISERROR(<span style="background-color: #F6CEF5">MID(A1, 1, FIND(",", A1)-1)</span>), "", <span style="background-color: #F6CEF5">MID(A1, 1, FIND(",", A1)-1)</span>) }} | ||
# to get the second string ( 121.512532 ): C1 = {{code | code=IF(ISERROR(TRIM(<span style="background-color: #ccc">MID(A1, FIND(",", A1)+1, LEN(A1))</span>)), "", TRIM(MID(<span style="background-color: #ccc">A1, FIND(",", A1)+1, LEN(A1))</span>) }} | # to get the second string ( 121.512532 ): C1 = {{code | code=IF(ISERROR(TRIM(<span style="background-color: #ccc">MID(A1, FIND(",", A1)+1, LEN(A1))</span>)), "", TRIM(MID(<span style="background-color: #ccc">A1, FIND(",", A1)+1, LEN(A1))</span>) }} | ||
| Line 43: | Line 76: | ||
[https://docs.google.com/spreadsheets/d/11VeZIxGR7orL4smf_0G6oi7VxxNi3FFsCL9b6QrLhBE/edit?usp=sharingc online demo] (allow to edit) | [https://docs.google.com/spreadsheets/d/11VeZIxGR7orL4smf_0G6oi7VxxNi3FFsCL9b6QrLhBE/edit?usp=sharingc online demo] (allow to edit) | ||
< | |||
=== Split text by symbol in PHP === | |||
PHP script | |||
<pre> | |||
$input_string = "25.040215, 121.512532"; | |||
$list1 = explode(",", $input_string); | |||
$trimmed_list1 = array_map('trim', $list1); | |||
$list2 = preg_split("/[\s,]+/", $input_string); | |||
print_r($trimmed_list1); | |||
print_r($list2); | |||
</pre> | |||
output | |||
<pre> | |||
Array ( | |||
[0] => 25.040215 | |||
[1] => 121.512532 | |||
) | |||
</pre> | |||
[[Category:Data transformation]] | [[Category:Data transformation]] | ||
Latest revision as of 14:29, 28 September 2023
Split text by symbol (another string, also called delimiter or separator)
Example data[edit]
Task: Split the position e.g. 25.040215, 121.512532 by the comma ( , ) to obtain the latitude (1st string) and longitude (2nd string).
Expected output:
- 25.040215
- 121.512532
Approaches[edit]
list of equivalent approaches to split text by symbol
- MySQL: MySQL SUBSTRING_INDEX() function - w3resource
- Excel: (1) parse data (資料剖析)
only allowed one character used as separator! (2) LEFT, RIGHT, FIND, LEN (3) MOD, FIND, LEN - PHP: explode or preg_split function
Split text by symbol in MySQL[edit]
Only one symbol was allowed. The following query will get the last element if there are more than two symbols.
- to get the first string ( 25.040215 ):
SELECT TRIM(SUBSTRING_INDEX('25.040215, 121.512532' ,',', 1)) - to get the second string ( 121.512532 ):
SELECT TRIM(SUBSTRING_INDEX('25.040215, 121.512532' ,',', -1))
CREATE TABLE tbl(str varchar(50));
INSERT INTO tbl VALUES('25.040215, 121.512532');
INSERT INTO tbl VALUES('25.040215, 121.512532, 999.999'); /*bad data: 2 or more symbols*/
INSERT INTO tbl VALUES('123.456'); /*bad data: no symbol exists*/
SELECT TRIM(SUBSTRING_INDEX(str ,',', 1)), TRIM(SUBSTRING_INDEX(str ,',', -1)) FROM tbl WHERE str LIKE '%,%';
Split text by symbol in Excel[edit]
Comparison of approaches
| Approach | Output |
|---|---|
| Approach 1: Parse data | Multiple parts will be returned if the content contains two or more separator characters. |
| Approach 2: SUBSTRING_INDEX | Only two parts will be returned if the content contains two or more separator characters. |
Approach 1: Parse data[edit]
- instruction with screenshots: Split text into different cells - Excel
Approach 2: SUBSTRING_INDEX[edit]
| A | B | C | |
|---|---|---|---|
| 1 | 25.040215, 121.512532 | 25.040215 |
121.512532 |
If the data set is perfect
- separator character: ,
- to get the first string ( 25.040215 ): B1 =
MID( A1, 1, FIND(",", A1)-1) - to get the second string ( 121.512532 ): C1 =
MID( A1, FIND(",", A1)+1, LEN(A1))
if the data set was mixed with empty values. Return empty if the value is empty.
- to get the first string ( 25.040215 ): B1 =
IF(ISERROR(MID(A1, 1, FIND(",", A1)-1)), "", MID(A1, 1, FIND(",", A1)-1)) - to get the second string ( 121.512532 ): C1 =
IF(ISERROR(TRIM(MID(A1, FIND(",", A1)+1, LEN(A1)))), "", TRIM(MID(A1, FIND(",", A1)+1, LEN(A1)))
online demo (allow to edit)
Split text by symbol in PHP[edit]
PHP script
$input_string = "25.040215, 121.512532";
$list1 = explode(",", $input_string);
$trimmed_list1 = array_map('trim', $list1);
$list2 = preg_split("/[\s,]+/", $input_string);
print_r($trimmed_list1);
print_r($list2);
output
Array ( [0] => 25.040215 [1] => 121.512532 )