How to Get All IP Addresses for a Domain Name | winway's blog

  • 1. Problem
  • 2. Approach
  • 3. Implementations
    • shell + dig version
    • Distributed shell + dig version
    • Go raw-packet version

1. Problem

Given a domain name, how can we obtain all of the IP addresses it resolves to?

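2. Approach

Query every DNS server in the world. In practice it is hard to obtain a complete list of DNS servers, so the best we can do is collect as many servers as possible and resolve as many IP addresses as we can.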

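  • Get a list of DNS servers worldwide. One is available at https://public-dns.info/. It is not guaranteed to be complete, but 23,000+ DNS servers spread across cities around the world is close enough.
  • Send a DNS query to each DNS server and collect the IP addresses from the responses.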
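
The first step above, turning a downloaded server list into a clean set of DNS server addresses, can be sketched in Go as follows. This is a minimal illustration and not the code from the author's repository; the function name parseNameservers and the one-address-per-line input format are assumptions.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"sort"
	"strings"
)

// parseNameservers (hypothetical helper) extracts the unique, valid IPv4
// addresses from a newline-separated list and returns them sorted.
func parseNameservers(text string) []string {
	seen := make(map[string]bool)
	var servers []string
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		ip := net.ParseIP(strings.TrimSpace(sc.Text()))
		if ip == nil || ip.To4() == nil {
			continue // skip blank lines, comments, IPv6 entries, and malformed lines
		}
		key := ip.To4().String()
		if !seen[key] {
			seen[key] = true
			servers = append(servers, key)
		}
	}
	sort.Strings(servers)
	return servers
}

func main() {
	// Assumed input format: one address per line, possibly with noise.
	list := "8.8.8.8\n# comment\n1.1.1.1\n8.8.8.8\n2001:4860:4860::8888\nnot-an-ip\n"
	fmt.Println(parseNameservers(list))
}
```

Validating with net.ParseIP is stricter than a digits-and-dots pattern, so malformed entries such as 999.1.1.1 are dropped before any query is sent.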

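3. Implementations

The examples below resolve the AAAA records of a domain. Code: https://github.com/winway/domain2IP

shell + dig version

Linux has a powerful domain name resolution command: dig. The query below asks a single DNS server for the AAAA records of www.who.int.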

# dig -t AAAA +noquestion +noadditional +noauthority @207.234.132.214 www.who.int

; <<>> DiG 9.9.4-RedHat-9.9.4-50.el7_3.1 <<>> -t AAAA +noquestion +noadditional +noauthority @207.234.132.214 www.who.int
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2516
;; flags: qr rd ra; QUERY: 1, ANSWER: 9, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1280
;; ANSWER SECTION:
www.who.int. 300 IN CNAME d334guqhelsxtv.cloudfront.net.
d334guqhelsxtv.cloudfront.net. 60 IN AAAA 2600:9000:2004:ca00:16:b115:d0c0:21
d334guqhelsxtv.cloudfront.net. 60 IN AAAA 2600:9000:2004:800:16:b115:d0c0:21
d334guqhelsxtv.cloudfront.net. 60 IN AAAA 2600:9000:2004:4e00:16:b115:d0c0:21
d334guqhelsxtv.cloudfront.net. 60 IN AAAA 2600:9000:2004:0:16:b115:d0c0:21
d334guqhelsxtv.cloudfront.net. 60 IN AAAA 2600:9000:2004:7400:16:b115:d0c0:21
d334guqhelsxtv.cloudfront.net. 60 IN AAAA 2600:9000:2004:2000:16:b115:d0c0:21
d334guqhelsxtv.cloudfront.net. 60 IN AAAA 2600:9000:2004:a000:16:b115:d0c0:21
d334guqhelsxtv.cloudfront.net. 60 IN AAAA 2600:9000:2004:b800:16:b115:d0c0:21

;; Query time: 783 msec
;; SERVER: 207.234.132.214#53(207.234.132.214)
;; WHEN: Thu Jun 14 16:14:35 CST 2018
;; MSG SIZE rcvd: 307
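
Only the AAAA lines of the answer section matter for this task. As a rough sketch of that filtering step in Go (the function name extractAAAA is hypothetical, and the input is assumed to be dig's default text output as shown above):

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// extractAAAA (hypothetical helper) returns the IPv6 addresses found in the
// AAAA answer lines of dig's text output.
func extractAAAA(digOutput string) []string {
	var ips []string
	sc := bufio.NewScanner(strings.NewReader(digOutput))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// An answer line has five fields: name, TTL, class, type, rdata.
		if len(fields) != 5 || fields[2] != "IN" || fields[3] != "AAAA" {
			continue
		}
		// Keep the value only if it parses as an IPv6 address.
		if ip := net.ParseIP(fields[4]); ip != nil && ip.To4() == nil {
			ips = append(ips, ip.String())
		}
	}
	return ips
}

func main() {
	out := "www.who.int. 300 IN CNAME d334guqhelsxtv.cloudfront.net.\n" +
		"d334guqhelsxtv.cloudfront.net. 60 IN AAAA 2600:9000:2004:ca00:16:b115:d0c0:21\n"
	fmt.Println(extractAAAA(out))
}
```

The CNAME line and dig's comment lines (those starting with a semicolon) are skipped because they do not have the five-field AAAA shape.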

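We use the dig command directly for name resolution: a shell script calls dig concurrently and aggregates the results. To keep the load from getting too high, the script caps the number of dig processes running at once.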

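# cat getIP.sh
#! /bin/bash
#
# Query every nameserver in conf/nameservers.txt for the AAAA records of a
# domain and write the de-duplicated addresses to result/<domain>.ip.txt.

CONFDIR=./conf
RESULTDIR=./result

if [[ $# -ne 1 ]]
then
    echo "Usage: sh $0 <url>"
    exit 1
fi
url=$1

# Keep only the lines that contain something shaped like an IPv4 address.
nameservers=($(grep -E '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' "${CONFDIR}/nameservers.txt" | sort -u))

# Maximum number of dig processes running at the same time.
N=300

# The timestamp keeps this run's temporary files apart from those of other runs.
ts=$(date '+%s')

mkdir -p "${RESULTDIR}"

for dns_server in "${nameservers[@]}"
do
    { dig -t AAAA +noquestion +noadditional +noauthority "@${dns_server}" "$url" | awk '/IN[ \t]+AAAA/{print $NF}' > "/tmp/.${url}.${ts}.${dns_server}.txt"; echo "dig @$dns_server $url done"; } &

    # Throttle: pause while more than N background jobs are running.
    joblist=($(jobs -p))
    while (( ${#joblist[@]} > N ))
    do
        echo "######## rest for a while ########"
        sleep 0.1
        joblist=($(jobs -p))
    done
done

wait

# Merge the per-server results, then remove this run's temporary files.
cat /tmp/.${url}.${ts}.*.txt | sort -u > "${RESULTDIR}/${url}.ip.txt"
find /tmp/ -name ".${url}.${ts}.*.txt" | xargs rm -f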

Running getIP.sh takes a little over 4 minutes and resolves 9,862 distinct IP addresses:

# time sh getIP.sh www.who.int

real 4m10.336s
user 4m12.732s
sys 3m55.625s
# wc -l result/www.who.int.ip.txt
9862 result/www.who.int.ip.txt
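
The throttling in getIP.sh polls the job table until the number of background dig processes drops. The same bounded-concurrency idea can be sketched in Go with a buffered channel used as a counting semaphore. This is an illustration only, not the author's implementation; resolveFunc, resolveAll, and the fake resolver in main are hypothetical names, and a real resolver (for example one that runs dig) would be plugged in where the fake one is.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// resolveFunc (hypothetical) queries one DNS server and returns the addresses
// it reported. It stands in for a real lookup.
type resolveFunc func(server string) []string

// resolveAll calls resolve for every server with at most limit calls in
// flight and returns the sorted union of all returned addresses.
// limit must be at least 1.
func resolveAll(servers []string, limit int, resolve resolveFunc) []string {
	sem := make(chan struct{}, limit) // counting semaphore
	var mu sync.Mutex
	var wg sync.WaitGroup
	seen := make(map[string]bool)

	for _, s := range servers {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit lookups are already running
		go func(server string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when this lookup ends
			ips := resolve(server)
			mu.Lock()
			for _, ip := range ips {
				seen[ip] = true
			}
			mu.Unlock()
		}(s)
	}
	wg.Wait()

	out := make([]string, 0, len(seen))
	for ip := range seen {
		out = append(out, ip)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Fake resolver: every server reports one shared and one unique address.
	fake := func(server string) []string {
		return []string{"2001:db8::1", "2001:db8::" + server}
	}
	fmt.Println(resolveAll([]string{"a", "b", "c"}, 2, fake))
}
```

Unlike the polling loop in the shell script, the send on the semaphore channel blocks exactly until a slot is free, so no sleep interval needs to be tuned.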

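Distributed shell + dig version

This version is a distributed variant of the previous one. The core logic is the same. The only difference is that the DNS server list is split into several parts, which are distributed to multiple machines and processed in parallel; the results are then fetched back and merged.
The following script splits the work, distributes it, and collects the results.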

# cat getIP_distributed.sh
#! /bin/bash
#
# Split the nameserver list across the work nodes, run job.sh on each node
# over ssh, then fetch and merge the per-node results.

CONFDIR=./conf
RESULTDIR=./result
TMPDIR=./tmp

if [[ $# -ne 1 ]]
then
    echo "Usage: sh $0 <url>"
    exit 1
fi
url=$1

mkdir -p "${TMPDIR}" "${RESULTDIR}"

# get work nodes
work_nodes=($(grep -E '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' "${CONFDIR}/work_nodes.txt" | sort -u))
work_nodes_num=${#work_nodes[@]}

# split nameservers into one chunk per work node
nameservers=($(grep -E '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' "${CONFDIR}/nameservers.txt" | sort -u))
nameservers_num=${#nameservers[@]}
chunk=$((nameservers_num / work_nodes_num + 1))

# Start every node with an empty list, so that a node that receives no
# nameservers still has a file to copy.
for node in "${work_nodes[@]}"
do
    : > "${TMPDIR}/${node}_nameservers.txt"
done

work_node_index=0
counter=1
for nameserver in "${nameservers[@]}"
do
    # Move on to the next node once the current chunk is full.
    if (( counter > chunk ))
    then
        ((work_node_index++))
        counter=1
    fi
    echo "$nameserver" >> "${TMPDIR}/${work_nodes[$work_node_index]}_nameservers.txt"
    ((counter++))
done

# distribute and run: copy the job and its list, run it, fetch the result
for node in "${work_nodes[@]}"
do
    echo "$node, count: $(wc -l ${TMPDIR}/${node}_nameservers.txt)"
    { scp job.sh "${TMPDIR}/${node}_nameservers.txt" "$node:/tmp/" && ssh "$node" "sh /tmp/job.sh $node $url" && scp "$node:/tmp/${node}_ip.txt" "$TMPDIR" && echo "[$(date)] $node done"; } &
done

wait

# Merge the per-node results. Start from an empty file so that results left
# over from an earlier run are not mixed in.
: > "$TMPDIR/${url}_ip.txt"
for node in "${work_nodes[@]}"
do
    cat "$TMPDIR/${node}_ip.txt" >> "$TMPDIR/${url}_ip.txt"
done

sort -u "$TMPDIR/${url}_ip.txt" > "${RESULTDIR}/${url}_ip.txt"

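The job.sh script, which runs on each work node, does the actual resolution. It is almost identical to the first version.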

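# cat job.sh
#! /bin/bash
#
# Runs on a work node: resolve the AAAA records of a domain against the
# nameservers assigned to this node and write them to <node>_ip.txt.

SCRIPT=$(readlink -f "$BASH_SOURCE")

# Work in the directory that holds this script and the nameserver list.
cd "$(dirname "$SCRIPT")" || { echo "cd $(dirname "$SCRIPT") failed"; exit 1; }

node=$1
url=$2

nameservers=($(grep -E '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' "${node}_nameservers.txt" | sort -u))

# Maximum number of dig processes running at the same time.
N=300

ts=$(date '+%s')

for nameserver in "${nameservers[@]}"
do
    { dig -t AAAA +noquestion +noadditional +noauthority "@${nameserver}" "$url" | awk '/IN[ \t]+AAAA/{print $NF}' > "/tmp/.${url}.${ts}.${nameserver}.txt"; } &

    # Throttle: pause while more than N background jobs are running.
    joblist=($(jobs -p))
    while (( ${#joblist[@]} > N ))
    do
        echo "######## rest for a while ########"
        sleep 0.1
        joblist=($(jobs -p))
    done
done

wait

cat /tmp/.${url}.${ts}.*.txt | sort -u > "${node}_ip.txt"
# Remove only the temporary files created by this run.
find /tmp/ -name ".${url}.${ts}.*.txt" | xargs rm -f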

Running it on 3 machines at the same time takes about 1 minute 43 seconds and again resolves 9,000+ IP addresses:

# time sh getIP_distributed.sh www.who.int

real 1m43.125s
user 0m1.060s
sys 0m0.201s

# wc -l result/www.who.int_ip.txt
9861 result/www.who.int_ip.txt
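
The splitting step in getIP_distributed.sh gives each node a chunk of n/nodes + 1 servers, which can leave the last node with noticeably fewer entries than the others. A balanced split, sketched here in Go under the assumption that the server list is already in memory, keeps the chunk sizes within one of each other. The function name splitEvenly is hypothetical and is not part of the author's repository.

```go
package main

import "fmt"

// splitEvenly (hypothetical helper) divides items into n contiguous chunks
// whose sizes differ by at most one. n must be at least 1.
func splitEvenly(items []string, n int) [][]string {
	chunks := make([][]string, n)
	base, extra := len(items)/n, len(items)%n
	start := 0
	for i := 0; i < n; i++ {
		size := base
		if i < extra {
			size++ // the first extra chunks take one additional item each
		}
		chunks[i] = items[start : start+size]
		start += size
	}
	return chunks
}

func main() {
	servers := []string{"s1", "s2", "s3", "s4", "s5", "s6", "s7"}
	for i, c := range splitEvenly(servers, 3) {
		fmt.Printf("node %d: %v\n", i, c)
	}
}
```

With 7 servers and 3 nodes this yields chunks of 3, 2, and 2 entries. Every node receives a chunk even when there are fewer servers than nodes; in that case some chunks are simply empty.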

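Both versions above rely purely on spawning many dig processes, which makes good performance hard to reach: each process consumes a lot of resources, and a large number of processes causes frequent context switches. A better design combines multiple processes with coroutines. dig cannot be used directly in that design, so a DNS resolution implementation based on asynchronous I/O is needed.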

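Go raw-packet version

This version constructs and sends the DNS query packets itself, then captures and parses the DNS response packets.
I first implemented this version in Python, but the performance was poor: about 1 minute on a single machine. I then rewrote it in Go.
Code: https://github.com/winway/domain2IP/blob/master/getIP.go
Running it on a single machine takes about 20 seconds. Attentive readers will notice that it resolves fewer IP addresses than the versions above (9,204 compared with about 9,860). The machine is low-spec, so a few packets were dropped.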

# time go run getIP.go --url www.who.int
2018/06/14 16:56:25 Start to dig url: www.who.int
2018/06/14 16:56:25 Start capturePacket
2018/06/14 16:56:25 Load 23298 nameserver
2018/06/14 16:56:25 Start sendPacket
2018/06/14 16:56:25 Set filter: udp and src port 53
2018/06/14 16:56:39 Complete sendPacket
2018/06/14 16:56:40 Waiting ...
2018/06/14 16:56:41 Waiting ...
2018/06/14 16:56:44 Get 9204 ip for www.who.int

real 0m19.643s
user 0m1.269s
sys 0m0.469s
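
To give an idea of what constructing a DNS query packet involves, here is a minimal sketch that encodes a single-question AAAA query by hand, following the wire format in RFC 1035. It is not taken from getIP.go. The function name buildQuery and the fixed query ID are assumptions made for illustration, and sending the packet and capturing the replies are left out.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"strings"
)

const (
	typeAAAA = 28 // QTYPE for IPv6 address records
	classIN  = 1  // QCLASS for the Internet
)

// buildQuery (hypothetical helper) encodes a DNS query with one question and
// the recursion-desired flag set.
func buildQuery(id uint16, name string, qtype uint16) ([]byte, error) {
	// 12-byte header: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
	msg := make([]byte, 12, 12+len(name)+6)
	binary.BigEndian.PutUint16(msg[0:], id)
	binary.BigEndian.PutUint16(msg[2:], 0x0100) // flags: RD=1
	binary.BigEndian.PutUint16(msg[4:], 1)      // QDCOUNT=1, other counts stay 0

	// QNAME: each label is prefixed with its length.
	for _, label := range strings.Split(strings.TrimSuffix(name, "."), ".") {
		if len(label) == 0 || len(label) > 63 {
			return nil, errors.New("invalid label in " + name)
		}
		msg = append(msg, byte(len(label)))
		msg = append(msg, label...)
	}
	msg = append(msg, 0) // the zero-length root label ends the name

	// QTYPE and QCLASS follow the name.
	var tail [4]byte
	binary.BigEndian.PutUint16(tail[0:], qtype)
	binary.BigEndian.PutUint16(tail[2:], classIN)
	return append(msg, tail[:]...), nil
}

func main() {
	q, err := buildQuery(0x1234, "www.who.int", typeAAAA)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes: % x\n", len(q), q)
}
```

For www.who.int the encoded query is 29 bytes long. The same buffer can be sent over UDP to port 53 of each DNS server in turn, with only the ID changed if responses need to be matched to individual queries.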
