Converting a CSV File to XML and Validating It Against an XSD

Hello.

This post shows how to convert a CSV file into an XML file with Python and check the result against an XSD schema.

CSV file
XSD file
Python code
XML file

CSV file

The following CSV file is used as input. The first row is the header.

title,author,publication_date,isbn,publisher,country,genre
小説①,物書太郎,1955-05-01,222-222,本屋A,アメリカ,喜劇
小説②,物書少女,2001-01-01,333-2222,本屋B,イギリス,恋愛
小説③,物書少年,1999-08-30,244-222,本屋C,日本,フィクション
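
As a small illustration of how the header row is used, the sketch below reads the header and the first data row with csv.DictReader. It keeps the text in memory with io.StringIO (an assumption made only so the example runs without book.csv on disk) and shows that each row becomes a dict keyed by the header names.

```python
import csv
import io

# Header plus the first data row of the sample, held in memory for the sketch.
sample = (
    "title,author,publication_date,isbn,publisher,country,genre\n"
    "小説①,物書太郎,1955-05-01,222-222,本屋A,アメリカ,喜劇\n"
)

reader = csv.DictReader(io.StringIO(sample))
rows = list(reader)

# The header row supplies the keys; it is not returned as a data row.
print(reader.fieldnames)
print(rows[0]["title"], rows[0]["publisher"])
```

Because the keys come from the header, the later code can refer to columns by name (row["title"]) instead of by position.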

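XSD file

An XSD file serves as the schema for the XML. It is used to check that the generated XML file follows the structure the XSD describes. Note that publisher and country are nested inside a publish element, so the XML is not a flat copy of the CSV columns.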

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="books">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
              <xs:element name="author" type="xs:string"/>
              <xs:element name="publication_date" type="xs:date"/>
              <xs:element name="isbn" type="xs:string"/>
              <xs:element name="publish">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="publisher" type="xs:string"/>
                    <xs:element name="country" type="xs:string"/>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
              <xs:element name="genre" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
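
The schema declares publication_date as xs:date, so values such as 1955/05/01 would fail validation. A rough pre-check of the CSV values can be sketched as follows. The helper name looks_like_xs_date is hypothetical, and the check is a simplification: it accepts only the plain YYYY-MM-DD form and ignores the optional time zone suffix that xs:date also allows.

```python
import re
from datetime import date

# Plain YYYY-MM-DD only; the optional xs:date time zone suffix is not handled.
DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def looks_like_xs_date(value):
    """Return True if value is a real calendar date written as YYYY-MM-DD."""
    match = DATE_PATTERN.fullmatch(value)
    if match is None:
        return False
    year, month, day = map(int, match.groups())
    try:
        date(year, month, day)  # rejects impossible dates such as 02-30
    except ValueError:
        return False
    return True


for value in ["1955-05-01", "1955/05/01", "1999-02-30"]:
    print(value, looks_like_xs_date(value))
```

A check like this only gives earlier and friendlier error messages; the XSD validation remains the authoritative test.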

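Python code

The csv and xml.etree.ElementTree modules are part of the standard library. xmlschema is a third-party package and must be installed separately (for example with pip install xmlschema).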

import csv
import xml.etree.ElementTree as ET
import xmlschema

csv_file_path = "book.csv"

xsd_file_path = "book.xsd"

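# newline="" is what the csv module documentation recommends when opening a file.
with open(csv_file_path, "r", encoding="UTF-8", newline="") as f:
    # DictReader uses the first row (the header) as the keys of each row.
    csv_data = csv.DictReader(f)
    print(type(csv_data))
    rows = list(csv_data)
    print(type(rows))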

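# Load the XSD once; it is used to validate the tree built below.
schema = xmlschema.XMLSchema(xsd_file_path)

# <books> is the root element declared in the XSD.
root = ET.Element("books")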

# Build one <book> element per CSV row, in the element order the XSD requires.
for row in rows:
    print(row)
    child = ET.SubElement(root, "book")
    title = ET.SubElement(child, "title")
    title.text = row["title"]
    author = ET.SubElement(child, "author")
    author.text = row["author"]
    publication_date = ET.SubElement(child, "publication_date")
    publication_date.text = row["publication_date"]
    isbn = ET.SubElement(child, "isbn")
    isbn.text = row["isbn"]
    # publisher and country are nested under <publish>, as the XSD specifies.
    publish = ET.SubElement(child, "publish")
    publisher = ET.SubElement(publish, "publisher")
    publisher.text = row["publisher"]
    country = ET.SubElement(publish, "country")
    country.text = row["country"]
    genre = ET.SubElement(child, "genre")
    genre.text = row["genre"]

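# validate() returns None on success and raises XMLSchemaValidationError on failure.
try:
    schema.validate(root)
except xmlschema.XMLSchemaValidationError as validation_error:
    print(validation_error)
else:
    ET.ElementTree(root).write("books.xml", encoding="utf-8", xml_declaration=True)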

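The call to schema.validate(root) checks the generated tree against the XSD file. In xmlschema, validate() returns None when the document is valid and raises XMLSchemaValidationError when it is not, so books.xml is written only when validation succeeds.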

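XML file

The generated XML file is shown below, indented here for readability. The script itself writes the elements without line breaks or indentation, preceded by an XML declaration.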

<books>
    <book>
        <title>小説①</title>
        <author>物書太郎</author>
        <publication_date>1955-05-01</publication_date>
        <isbn>222-222</isbn>
        <publish>
            <publisher>本屋A</publisher>
            <country>アメリカ</country>
        </publish>
        <genre>喜劇</genre>
    </book>
    <book>
        <title>小説②</title>
        <author>物書少女</author>
        <publication_date>2001-01-01</publication_date>
        <isbn>333-2222</isbn>
        <publish>
            <publisher>本屋B</publisher>
            <country>イギリス</country>
        </publish>
        <genre>恋愛</genre>
    </book>
    <book>
        <title>小説③</title>
        <author>物書少年</author>
        <publication_date>1999-08-30</publication_date>
        <isbn>244-222</isbn>
        <publish>
            <publisher>本屋C</publisher>
            <country>日本</country>
        </publish>
        <genre>フィクション</genre>
    </book>
</books>
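
To have the file itself come out indented like the listing above, one option is ET.indent, available in Python 3.9 and later. The sketch below builds a small tree by hand (a shortened stand-in for the real data, not the full book record) and indents it with four spaces.

```python
import xml.etree.ElementTree as ET

# A shortened stand-in for the tree built from the CSV rows.
root = ET.Element("books")
book = ET.SubElement(root, "book")
ET.SubElement(book, "title").text = "小説①"
publish = ET.SubElement(book, "publish")
ET.SubElement(publish, "publisher").text = "本屋A"

# ET.indent modifies the tree in place by inserting whitespace between elements.
ET.indent(root, space="    ")

pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

In the conversion script, the same ET.indent call would go just before the tree is written to books.xml.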
