Splitting of articles by article ids: Difference between revisions

From LemonWiki共筆
Jump to navigation Jump to search
mNo edit summary
 
(One intermediate revision by the same user not shown)
Line 14: Line 14:
</pre>
</pre>


== Case2: Articles IDs which mix of string and number ==
If the article ID consists of numbers or text e.g. [https://en.wikipedia.org/wiki/Amazon_Standard_Identification_Number Amazon Standard Identification Number], it is divided into two parts after a formula calculation with the remainder equal to 0 or 1, resulting in two parts of different quantities.
CONV(HEX(article_id), 16, 10) converts HEX(article_id) from hexadecimal (base 16) to decimal (base 10). Then, MOD(CONV(HEX(article_id), 16, 10), 2) computes the remainder when the converted decimal number is divided by 2.
(1) Create test data
<pre>
SET NAMES utf8mb4;
DROP TABLE IF EXISTS `strnum_article_ids`;
CREATE TABLE `strnum_article_ids` (
  `article_id` char(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `content` text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_vietnamese_ci;
INSERT INTO `strnum_article_ids` (`article_id`, `content`) VALUES
('ap1p3le3', 'apple'),
('or2an3ge4', 'orange'),
('ba34na5na', 'banana'),
('wate4rle56mon', 'waterlemon'),
('ki78wi', 'kiwi');
</pre>
(2) Part 1
<pre>
SELECT *
FROM `strnum_article_ids`
WHERE MOD(CONV(HEX(article_id), 16, 10), 2) = 0;
</pre>
(3) Part 2
<pre>
SELECT *
FROM `strnum_article_ids`
WHERE MOD(CONV(HEX(article_id), 16, 10), 2) = 1;
</pre>


[[Category:Data_hygiene]]
[[Category:Data_hygiene]]
[[Category:Data Science]]
[[Category:Data Science]]
[[Category:MySQL]]
[[Category:MySQL]]

Latest revision as of 10:59, 13 December 2023


Case1: Articles Numbered IDs[edit]

If the article number is a digit, split the article into two parts based on a remainder of 0 or 1, resulting in two nearly equal halves.

(1) Part1

SELECT * FROM articles WHERE MOD(article_id, 2) = 0;

(2) Part2

SELECT * FROM articles WHERE MOD(article_id, 2) = 1;

Case2: Articles IDs which mix of string and number[edit]

If the article ID consists of numbers or text e.g. Amazon Standard Identification Number, it is divided into two parts after a formula calculation with the remainder equal to 0 or 1, resulting in two parts of different quantities.

CONV(HEX(article_id), 16, 10) converts HEX(article_id) from hexadecimal (base 16) to decimal (base 10). Then, MOD(CONV(HEX(article_id), 16, 10), 2) computes the remainder when the converted decimal number is divided by 2.

(1) Create test data

SET NAMES utf8mb4;

DROP TABLE IF EXISTS `strnum_article_ids`;
CREATE TABLE `strnum_article_ids` (
  `article_id` char(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `content` text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_vietnamese_ci;

INSERT INTO `strnum_article_ids` (`article_id`, `content`) VALUES
('ap1p3le3',	'apple'),
('or2an3ge4',	'orange'),
('ba34na5na',	'banana'),
('wate4rle56mon',	'waterlemon'),
('ki78wi',	'kiwi');

(2) Part 1

SELECT *
FROM `strnum_article_ids`
WHERE MOD(CONV(HEX(article_id), 16, 10), 2) = 0;

(3) Part 2

SELECT *
FROM `strnum_article_ids`
WHERE MOD(CONV(HEX(article_id), 16, 10), 2) = 1;