At my work our collections management database (CMS) is exported into a MySQL dump. Due to the structure of the CMS multivalue fields from the tables are split into separate tables in the dump e.g. specimens gets split into specimens (single value fields only from the specimens table) and specimens_mv (multivalue fields from the specimens table). In our database you can link specimens that share data (e.g. collector number is the same for two specimens) via an associated material field. I’m looking for specimen records that have been inappropriately associated with one another:
SELECT sp.specimen_num, sp.barcode AS barcode_1, sp.coll_name AS collector_1, sp.coll_num AS coll_num_1, mv.nomen_type_kind as type_2, sp2.specimen_num AS specimen_num_2, sp2.barcode AS barcode_2, sp2.COLL_NAME AS coll_name_2, sp2.COLL_NUM AS coll_num_2, FROM specimens_mv AS mv JOIN specimens AS sp ON mv.specimen_num = sp.specimen_num JOIN specimens AS sp2 ON mv.ASSOCIATED_MAT_SPECIMEN_NUM = sp2.specimen_num where sp2.coll_num != sp.coll_num;
The problem is these pairwise associations are reciprocated and often there more than two specimens associated together e.g.
specimen_num 1 | specimen_num 2
Any suggestions to remove pairwise duplications would be greatly appreciated!