# Formatting/displaying JMA XML data in Julia
(JMA is the Japan Meteorological Agency.)
# Motivation
I wrote this to complete an assignment for a university database class: retrieve XML data over REST.
# Overview
- Get a JMA Atom feed in JMA XML format using HTTP.jl
- Show the XML data using EzXML.jl
# Main
# The data contents
The following is part of the XML data:
<feed xmlns="http://www.w3.org/2005/Atom" lang="ja">
<title>長期(定時)</title>
<subtitle>JMAXML publishing feed</subtitle>
<updated>2021-05-15T23:12:06+09:00</updated>
<id>http://www.data.jma.go.jp/developer/xml/feed/regular.xml#long_1621087926</id>
<link rel="related" href="http://www.jma.go.jp/"/>
<link rel="self" href="http://www.data.jma.go.jp/developer/xml/feed/regular.xml"/>
<link rel="hub" href="http://alert-hub.appspot.com/"/>
<rights type="html">
<![CDATA[ <a href="http://www.jma.go.jp/jma/kishou/info/coment.html">利用規約</a>, <a href="http://www.jma.go.jp/jma/en/copyright.html">Terms of Use</a> ]]>
</rights>
<entry>
<title>大雨危険度通知</title>
<id>http://www.data.jma.go.jp/developer/xml/data/20210515141010_0_VPRN50_010000.xml</id>
<updated>2021-05-15T14:09:42Z</updated>
<author>
<name>気象庁</name>
</author>
<link type="application/xml" href="http://www.data.jma.go.jp/developer/xml/data/20210515141010_0_VPRN50_010000.xml"/>
<content type="text">【大雨危険度通知】</content>
</entry>
<entry>
<title>地上実況図</title>
<id>http://www.data.jma.go.jp/developer/xml/data/20210515140100_0_VZSA50_010000.xml</id>
<updated>2021-05-15T14:00:49Z</updated>
<author>
<name>気象庁</name>
</author>
<link type="application/xml" href="http://www.data.jma.go.jp/developer/xml/data/20210515140100_0_VZSA50_010000.xml"/>
<content type="text">【地上実況図】</content>
</entry>
<entry>
...
# Import packages
using Pkg
Pkg.add("HTTP"); using HTTP
Pkg.add("EzXML"); using EzXML
If you're using Jupyter, you only need to run using XXX once per session. If you write the code in a .jl file and run it from the command line, the packages are loaded again on every run, because each run starts a new Julia session, and the wait quickly becomes frustrating. One reported solution is to use PackageCompiler.jl.
Reference:
PackageCompiler.jl で Plots の呼び出しを高速化する 2020 年 7 月版 - Qiita (Speeding up loading Plots with PackageCompiler.jl, July 2020 edition)
PackageCompiler.jl で Makie.jl の呼び出しを速くする - Qiita (Speeding up loading Makie.jl with PackageCompiler.jl)
# Get an Atom feed with an HTTP request
Use HTTP.jl.
req = HTTP.request("GET", "http://www.data.jma.go.jp/developer/xml/feed/regular_l.xml")
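# Status code
println(req.status)
# The contents. String(req.body) takes over the byte vector and leaves req.body
# empty, so convert a copy here to keep the body available for parsing later.
println(String(copy(req.body)))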
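The walkthrough converts the response body with String(req.body) more than once, so it helps to see what that conversion does to the underlying bytes. Below is a minimal sketch that uses only Base Julia. The variable body is a hypothetical stand-in for req.body, on the assumption that the body is a plain Vector{UInt8}.

```julia
# Stand-in for req.body (assumption: the body is a plain Vector{UInt8}).
body = Vector{UInt8}("<feed><title>demo</title></feed>")

# Converting a copy leaves the original vector untouched.
preview = String(copy(body))
bytes_after_copy = length(body)

# Converting the vector itself hands its memory to the String
# and truncates the vector to zero length.
xml_text = String(body)
bytes_after_take = length(body)

println(bytes_after_copy == ncodeunits(xml_text))  # true
println(bytes_after_take)                          # 0
println(preview == xml_text)                       # true
```

If the body is needed in several places, converting it once and reusing the resulting string is probably the simplest approach.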
# Analyzing XML data with EzXML.jl
## Loading XML data
| Data format | Syntax |
| --- | --- |
| XML file | readxml(your_xml_file) |
| XML string | parsexml(""" <feed xmlns=" .... """) |

doc = parsexml(String(req.body))
primates = root(doc)
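To make the two loading routes concrete, here is a small sketch that parses the same XML once from a string and once from a file. The sample XML and the temporary file are assumptions made for illustration, not JMA data.

```julia
using EzXML

# Hypothetical sample standing in for the JMA feed.
xml_text = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>demo feed</title>
</feed>
"""

# From a string: parsexml
doc_from_string = parsexml(xml_text)

# From a file: readxml (a temporary file keeps the sketch self-contained)
path = tempname() * ".xml"
write(path, xml_text)
doc_from_file = readxml(path)
rm(path)

println(root(doc_from_string).name)  # feed
println(root(doc_from_file).name)    # feed
```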
## EzXML.jl types
The main types in EzXML.jl are EzXML.Document and EzXML.Node (at least, those are the ones used in this post).
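A quick way to see these two types is to check them on a tiny document. The XML string below is a made-up sample.

```julia
using EzXML

doc = parsexml("<feed><title>demo</title></feed>")
feed = root(doc)

println(doc isa EzXML.Document)                 # true
println(feed isa EzXML.Node)                    # true
# Child elements are EzXML.Node values as well.
println(firstelement(feed) isa EzXML.Node)      # true
```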
## Take the root of the XML data
doc = parsexml(String(req.body))
primates = root(doc)
The type of doc is EzXML.Document. root(doc) returns the root element as an EzXML.Node, and that node is what you traverse to analyze the contents of the XML data.
## Analyzing a Node
Here, primates is the root node of doc: doc is of type EzXML.Document and primates is of type EzXML.Node. What we want to analyze are the child elements of primates, and eachelement() returns an iterator over them, as below.
for genus in eachelement(primates)
    # Print each child element, including everything nested inside it.
    println(genus)
end
This prints:
<title>高頻度(定時)</title>
<subtitle>JMAXML publishing feed</subtitle>
<updated>2021-06-03T02:12:07+09:00</updated>
<id>http://www.data.jma.go.jp/developer/xml/feed/regular.xml#short_1622653927</id>
<link rel="related" href="http://www.jma.go.jp/"/>
<link rel="self" href="http://www.data.jma.go.jp/developer/xml/feed/regular.xml"/>
<link rel="hub" href="http://alert-hub.appspot.com/"/>
<rights type="html"><![CDATA[
<a href="http://www.jma.go.jp/jma/kishou/info/coment.html">利用規約</a>,
<a href="http://www.jma.go.jp/jma/en/copyright.html">Terms of Use</a>
]]></rights>
<entry>
<title>大雨危険度通知</title>
<id>http://www.data.jma.go.jp/developer/xml/data/20210602171020_0_VPRN50_010000.xml</id>
<updated>2021-06-02T17:09:44Z</updated>
<author>
<name>気象庁</name>
</author>
<link type="application/xml" href="http://www.data.jma.go.jp/developer/xml/data/20210602171020_0_VPRN50_010000.xml"/>
<content type="text">【大雨危険度通知】</content>
</entry>
...
That is, println(genus) outputs the full contents of the hierarchy below each genus.
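As a small illustration of what eachelement() visits, the sketch below collects the names of the child elements of a made-up feed. It also compares countelements() with countnodes(), which counts every child node (including any whitespace text nodes the parser kept) rather than elements only.

```julia
using EzXML

# Hypothetical sample, not JMA data.
doc = parsexml("""
<feed>
  <title>demo feed</title>
  <entry><id>1</id></entry>
</feed>
""")
feed = root(doc)

# eachelement visits only element children, in document order.
element_names = [child.name for child in eachelement(feed)]
println(element_names)         # ["title", "entry"]
println(countelements(feed))   # 2

# countnodes also counts non-element children such as text nodes.
println(countnodes(feed) >= countelements(feed))  # true
```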
## Arrange and print the Node contents
To walk the hierarchy of <title>, <id>, <updated>, etc. inside <entry>, nest a second for loop. To output only a node's name, use .name; the node's text is in .content. For example, if the data is <title>大雨危険度通知</title>:
primates = root(parsexml("""<title>大雨危険度通知</title>"""))
println(primates.name, ":", primates.content)
# > title:大雨危険度通知
e.g.:
for genus in eachelement(primates)
    # Print the name and text of every top-level element except <entry>.
    if genus.name != "entry"
        println("--- ", genus.name, ": ", genus.content)
    end
    # Print the name and text of each child element (the fields of <entry>).
    for species in eachelement(genus)
        println(" └-- ", species.name, ": ", species.content, "")
    end
    println("---------------------------------------")
end
--- title: 高頻度(定時)
---------------------------------------
--- subtitle: JMAXML publishing feed
---------------------------------------
--- updated: 2021-06-03T02:22:06+09:00
---------------------------------------
--- id: http://www.data.jma.go.jp/developer/xml/feed/regular.xml#short_1622654526
---------------------------------------
--- link:
---------------------------------------
--- link:
---------------------------------------
--- link:
---------------------------------------
--- rights: <a href="http://www.jma.go.jp/jma/kishou/info/coment.html">利用規約</a>, <a href="http://www.jma.go.jp/jma/en/copyright.html">Terms of Use</a>
---------------------------------------
 └-- title: 大雨危険度通知
 └-- id: http://www.data.jma.go.jp/developer/xml/data/20210602172011_0_VPRN50_010000.xml
 └-- updated: 2021-06-02T17:19:42Z
 └-- author: 気象庁
 └-- link:
 └-- content: 【大雨危険度通知】
...
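One detail worth knowing when printing .content: for an element with children, such as <author>, the content is the text of everything below it, so it can carry the surrounding line breaks and indentation. A minimal sketch with a made-up entry follows. Applying strip is a suggestion here, not something the original code does.

```julia
using EzXML

# Hypothetical sample shaped like one <entry> of the feed.
doc = parsexml("""
<entry>
  <author>
    <name>気象庁</name>
  </author>
</entry>
""")
author = firstelement(root(doc))

raw = author.content   # text of all descendants, whitespace included
clean = strip(raw)     # leading and trailing whitespace removed

println(repr(raw))
println(clean)         # 気象庁
```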
## Print the href attribute of each Node
The output above is missing attribute values such as the href of <link>, because .content returns only a node's text. To get an attribute (e.g. href), index the node by the attribute name, like genus["href"]; the value is returned as a string. For example, given
... <link type="application/xml" href="http://www.data.jma.go.jp/developer/xml/data/20210602171020_0_VPRN50_010000.xml"/> ...
you get the following:
primates = root(parsexml("""<link type="application/xml" href="http://www.data.jma.go.jp/developer/xml/data/20210602171020_0_VPRN50_010000.xml"/>"""))
println(primates["href"])
# > http://www.data.jma.go.jp/developer/xml/data/20210602171020_0_VPRN50_010000.xml
e.g.:
for genus in eachelement(primates)
    # Top-level elements: print the text, or the href attribute for <link>.
    if genus.name != "entry" && genus.name != "link"
        println("--- ", genus.name, ": ", genus.content)
    elseif genus.name == "link"
        println("--- ", genus.name, ": ", genus["href"])
    end
    # Child elements: same rule, href for <link> and text for the rest.
    for species in eachelement(genus)
        if species.name == "link"
            println("--- ", species.name, ": ", species["href"])
        else
            println(" └-- ", species.name, ": ", species.content, "")
        end
    end
    println("---------------------------------------")
end
...
--- id: http://www.data.jma.go.jp/developer/xml/feed/regular.xml#short_1622655067
---------------------------------------
--- link: http://www.jma.go.jp/
---------------------------------------
--- link: http://www.data.jma.go.jp/developer/xml/feed/regular.xml
---------------------------------------
--- link: http://alert-hub.appspot.com/
---------------------------------------
...
---------------------------------------
 └-- title: 大雨危険度通知
 └-- id: http://www.data.jma.go.jp/developer/xml/data/20210602173000_0_VPRN50_010000.xml
 └-- updated: 2021-06-02T17:29:38Z
 └-- author: 気象庁
--- link: http://www.data.jma.go.jp/developer/xml/data/20210602173000_0_VPRN50_010000.xml
 └-- content: 【大雨危険度通知】
---------------------------------------
...
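Instead of branching on the element name, one option is to check whether the attribute exists before reading it. The sketch below assumes that approach. display_value is a hypothetical helper, and the XML is a made-up sample with an example.com URL.

```julia
using EzXML

# Hypothetical helper: use href when the element has one, otherwise its text.
display_value(node) = haskey(node, "href") ? node["href"] : strip(node.content)

doc = parsexml("""
<entry>
  <title>demo</title>
  <link type="application/xml" href="http://example.com/data.xml"/>
</entry>
""")

for child in eachelement(root(doc))
    println(child.name, ": ", display_value(child))
end
# title: demo
# link: http://example.com/data.xml
```

Checking with haskey first avoids the error raised when indexing a node by an attribute it does not have.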
EzXML has other functions as well: for instance, hasnode(node) checks whether the node has a child node, and istext(node) checks whether the node is a text node. See Reference · EzXML.jl for details.
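The two predicates can be tried on a small made-up document. The sketch also uses firstelement, lastelement, firstnode and iselement from EzXML to pick out the nodes to test.

```julia
using EzXML

# Hypothetical sample, not JMA data.
doc = parsexml("<entry><title>demo</title><link href=\"http://example.com/\"/></entry>")
entry = root(doc)
title = firstelement(entry)   # <title>demo</title>
link = lastelement(entry)     # <link href="..."/>
text = firstnode(title)       # the text node "demo" inside <title>

println(hasnode(entry))       # true: <entry> has child nodes
println(hasnode(link))        # false: <link/> is empty
println(istext(text))         # true
println(istext(title))        # false: <title> is an element
println(iselement(title))     # true
```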
Putting it all together:
using Pkg
Pkg.add("HTTP"); using HTTP
Pkg.add("EzXML"); using EzXML
req = HTTP.request("GET", "http://www.data.jma.go.jp/developer/xml/feed/regular_l.xml")
doc = parsexml(String(req.body))
primates = root(doc)
id_name = ""
id_content = ""
println("\n\n\n\n===== XML =====")
for genus in eachelement(primates)
    # Feed-level elements: print the text, or the href attribute for <link>.
    if genus.name != "entry" && genus.name != "link"
        println("--- ", genus.name, ": ", genus.content)
    elseif genus.name == "link"
        println("--- ", genus.name, ": ", genus["href"])
    end
    # Children of <entry>: print non-empty fields and hold <id> back
    # so that it is printed last.
    for species in eachelement(genus)
        if species.content != "" && species.name != "id"
            println(" └-- ", species.name, ": ", species.content, "")
        elseif species.name == "id"
            global id_name = species.name
            global id_content = species.content
        end
    end
    if id_name != ""
        println(" └-- ", id_name, ": ", id_content, "")
        global id_name = ""
        global id_content = ""
    end
    println("---------------------------------------")
end
output:
===== XML =====
--- title: 高頻度(定時)
---------------------------------------
--- subtitle: JMAXML publishing feed
---------------------------------------
--- updated: 2021-06-03T03:02:07+09:00
---------------------------------------
--- id: http://www.data.jma.go.jp/developer/xml/feed/regular.xml#short_1622656927
---------------------------------------
--- link: http://www.jma.go.jp/
---------------------------------------
--- link: http://www.data.jma.go.jp/developer/xml/feed/regular.xml
---------------------------------------
--- link: http://alert-hub.appspot.com/
---------------------------------------
--- rights:
<a href="http://www.jma.go.jp/jma/kishou/info/coment.html">利用規約</a>,
<a href="http://www.jma.go.jp/jma/en/copyright.html">Terms of Use</a>
---------------------------------------
└-- title: 大雨危険度通知
└-- updated: 2021-06-02T17:59:40Z
└-- author:
気象庁
└-- content: 【大雨危険度通知】
└-- id: http://www.data.jma.go.jp/developer/xml/data/20210602180010_0_VPRN50_010000.xml
--- title: 高頻度(定時)
--- subtitle: JMAXML publishing feed
--- updated: 2021-06-03T03:02:07+09:00
--- id: http://www.data.jma.go.jp/developer/xml/feed/regular.xml#short_1622656927
---------------------------------------
└-- title: 大雨危険度通知
└-- updated: 2021-06-02T17:49:37Z
└-- author:
気象庁
└-- id: http://www.data.jma.go.jp/developer/xml/data/20210602175001_0_VPRN50_010000.xml
--- title: 高頻度(定時)
--- subtitle: JMAXML publishing feed
--- updated: 2021-06-03T03:02:07+09:00
--- id: http://www.data.jma.go.jp/developer/xml/feed/regular.xml#short_1622656927
---------------------------------------
...
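The script above tracks the <id> of each entry in global variables so that it can print it last. As a possible alternative, the same idea can be kept inside a function. This is only a sketch: print_entry is a hypothetical helper, the XML is a made-up sample, and unlike the script it strips surrounding whitespace from each value.

```julia
using EzXML

# Hypothetical helper: print the non-empty fields of one <entry>,
# holding <id> back so that it is printed last.
function print_entry(io::IO, entry::EzXML.Node)
    id_content = nothing
    for field in eachelement(entry)
        if field.name == "id"
            id_content = field.content
        elseif !isempty(field.content)
            println(io, " └-- ", field.name, ": ", strip(field.content))
        end
    end
    id_content === nothing || println(io, " └-- id: ", id_content)
    return nothing
end

doc = parsexml("""
<feed>
  <entry>
    <id>http://example.com/1.xml</id>
    <title>demo</title>
    <link href="http://example.com/1.xml"/>
  </entry>
</feed>
""")

for entry in eachelement(root(doc))
    entry.name == "entry" && print_entry(stdout, entry)
end
#  └-- title: demo
#  └-- id: http://example.com/1.xml
```

Taking an IO argument makes the helper easy to check, since the output can be captured with sprint instead of going to the terminal.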
# Reference
JuliaIO/EzXML.jl - github
EzXML.jl を作った話 - りんごがでている - hatenablog (The story of making EzXML.jl)
PackageCompiler.jl で Plots の呼び出しを高速化する 2020 年 7 月版 - Qiita (Speeding up loading Plots with PackageCompiler.jl, July 2020 edition)
PackageCompiler.jl で Makie.jl の呼び出しを速くする - Qiita (Speeding up loading Makie.jl with PackageCompiler.jl)