Data cleaning: Difference between revisions

Jump to navigation Jump to search
723 bytes added ,  2 November 2016
Line 295: Line 295:


* MySQL:  
* MySQL:  
** Using {{kbd | key =CONCAT}} for multiple column ex: column_1, column_2  
 
Find the duplicated data for one column<ref>[http://stackoverflow.com/questions/688549/finding-duplicate-values-in-mysql?rq=1 Finding duplicate values in MySQL - Stack Overflow]</ref>
<pre>
-- Generate test data.
CREATE TABLE `table_name` (
  `id` int(11) NOT NULL,
  `content` varchar(5) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
 
INSERT INTO `table_name` (`id`, `content`) VALUES
(1, 'apple'),
(2, 'lemon'),
(3, 'apple');
 
ALTER TABLE `table_name`
  ADD PRIMARY KEY (`id`);
 
-- Find duplicated data
SELECT `content`, COUNT(*) count FROM `table_name` GROUP BY `content` HAVING count > 1;
 
SELECT tmp.* FROM
(
  SELECT `content`, count(*) count FROM `table_name` GROUP BY `content`
) tmp
WHERE tmp.count >1;
</pre>
 
Using {{kbd | key =CONCAT}} for multiple column ex: column_1, column_2  
<pre>
<pre>
SELECT tmp.key FROM
SELECT tmp.key FROM

Navigation menu