glibcのgetaddrinfo()問題 – CVE-2015-7547: glibc getaddrinfo stack-based buffer overflow

Computer, Security 2月 17, 2016
(Last Updated On: )

CVE-2015-7547: glibc getaddrinfo stack-based buffer overflow として2/16に公開されたセキュリティ問題に対するアクセスが多いようですが、古いglibcのページがヒットするようなので簡単にまとめておきます。コードレベルで詳しくは見ていません。詳しい事は他の方がまとめてくれると思います。この検索してこのブログに来てしまった方向けに簡単に紹介します。

GHOSTとは別の問題

RHEL6のglibcのパッチを見るとGHOSTとして知られるgethostbyaddr()の問題とは別の問題です。”参考”に記載したsourcewareのバグレポートからも別の問題だと分かります。

攻撃の方法

攻撃用のDNSサーバーを作ってgetaddrinfo()が脆弱なサーバーのプログラムにこの関数を呼び出させます。この関数はネットワーク関連のプログラムで呼び出されることが多く、ネットワークを扱うプログラムの大半で利用されていると思われます。

“CVE-2015-7547 仕組み”をキーワードとした検索も多いようなので追記します。脆弱性の仕組みは

の”High-level Analysis”と”Detailed Analysis”を参照してください。

攻撃が成功した場合の影響

2.9以降(2008年5月)のglibcに影響し、攻撃に成功すると攻撃対象のプログラムから任意コードが実行されます。

攻撃が失敗した場合の痕跡

攻撃されたプログラムがクラッシュします。不信なクラッシュを検出すれば攻撃されていることが分かります。DNSパケットを監視することでも攻撃は検出/防止できます。

この攻撃は”スタックオーバーフロー”としてGoogleのブログでは紹介されています。最終的には2KBのスタック変数のスタックオーバーフローで62KB程度のメモリが攻撃に利用できるようです。最近のプログラムはスタックオーバーフロー対策付きでコンパイルされているので、単純にオーバーフロー=攻撃成功とはなりません。また、スタック領域にコピーされるメモリ領域はヒープ領域なので攻撃に失敗した場合はASLRで保護されます。(ASLRによる保護は完全ではありません。Googleのブログでは書かれていないですがカナリア値によるスタックオーバーフロー保護も多くの環境で働くはずです)

緩和策/防御策

残念ながらgetaddrinfo()を利用するプログラム側での緩和策は仕組み的に無理だと思われ、利用するプログラムも多く現実的ではありません。防御策は以下の通りです。

  • glibcをアップデート(最良)
  • DNSで対応(ほぼOK。ただし単純に2KBを超えるDNSレスポンスを拒否すると問題となる可能性があるので、採用する場合は注意が必要)
  • SystemTapを使って防御(バグレポートで紹介されていますが、効果/副作用などは判りません)

AmazonのAWSで利用するデフォルトのDNSは対応済みらしく、設定を変えていないユーザーは大丈夫とAmazonがアナウンスしています。

メジャーなディストリビューションは修正版glibcを公開しています。rpm系なら

yum update glibc または dnf update glibc

で更新できるはずです。

 

対策にならない対策

以下のDNSサーバー設定、システム設定では対策になりません。

  • options single-requestを設定する
  • options single-request-reopenを設定する
  • IPv6を無効にする

 

まとめ

攻撃できないPoC、とはいえPoCが公開されており、バグの内容もかなり詳細に解説されており、パッチも当然公開されています。リスクは高いと思われます。早急に対策する必要があると思います。

 

参考

Main conclusions:

- Via getaddrinfo with family AF_UNSPEC or AF_INET6 the overflowed
  buffer is located on the stack via alloca (a 2048 byte fixed size
  buffer for DNS responses).

- At most 65535 bytes (MAX_PACKET) may be written to the alloca buffer
  of 2048 bytes. Overflowing bytes are entirely under the control of the
  attacker and are the result of a crafted DNS response.

- Local testing shows that we have been able to control at least the
  execution of one free() call with the buffer overflow and gained
  control of EIP. Further exploitation was not attempted, only this
  single attempt to show that it is very likely that execution control
  can be gained without much more effort. We know of no known attacks
  that use this specific vulnerability.

- Mitigating factors for UDP include:
  - A firewall that drops UDP DNS packets > 512 bytes.
  - A local resolver (that drops non-compliant responses).
  - Avoid dual A and AAAA queries (avoids buffer management error) e.g.
    Do not use AF_UNSPEC.
  - No use of `options edns0` in /etc/resolv.conf since EDNS0 allows
    responses larger than 512 bytes and can lead to valid DNS responses
    that overflow.
  - No use of `RES_USE_EDNS0` or `RES_USE_DNSSEC` since they can both
    lead to valid large EDNS0-based DNS responses that can overflow.

- Mitigating factors for TCP include:
  - Limit all replies to 1024 bytes.

- Mitigations that don't work:
  - Setting `options single-request` does not change buffer management
    and does not prevent the exploit.
  - Setting `options single-request-reopen` does not change buffer
    management and does not prevent the exploit.
  - Disabling IPv6 does not disable AAAA queries. The use of AF_UNSPEC
    unconditionally enables the dual query.
    - The use of `sysctl -w net.ipv6.conf.all.disable_ipv6=1` will not
      protect your system from the exploit.
  - Blocking IPv6 at a local or intermediate resolver does not work to
    prevent the exploit. The exploit payload can be delivered in A or
    AAAA results, it is the parallel query that triggers the buffer
    management flaw.

- The code that causes the vulnerability was introduced in May 2008 as
  part of glibc 2.9.

- The code that causes the vulnerability is only present in glibc's copy
  of libresolv which has enhancements to carry out parallel A and AAAA
  queries. Therefore only programs using glibc's copy of the code have
  this problem.

- A back of the envelope analysis shows that it should be possible to
  write correctly formed DNS responses with attacker controlled payloads
  that will penetrate a DNS cache hierarchy and therefore allow
  attackers to exploit machines behind such caches.

これを見るとごく単純なスタックオーバーフローではなく、メモリ割り当ての問題などで最終的にスタックオーバーフローになることが解ります。

  • CentOS 6のglibcのパッチ
Index: glibc-2.12-2-gc4ccff1/resolv/nss_dns/dns-host.c
===================================================================
--- glibc-2.12-2-gc4ccff1.orig/resolv/nss_dns/dns-host.c
+++ glibc-2.12-2-gc4ccff1/resolv/nss_dns/dns-host.c
@@ -1043,7 +1043,10 @@ gaih_getanswer_slice (const querybuf *an
   int h_namelen = 0;
 
   if (ancount == 0)
-    return NSS_STATUS_NOTFOUND;
+    {
+      *h_errnop = HOST_NOT_FOUND;
+      return NSS_STATUS_NOTFOUND;
+    }
 
   while (ancount-- > 0 && cp < end_of_message && had_error == 0)
     {
@@ -1217,7 +1220,14 @@ gaih_getanswer_slice (const querybuf *an
   /* Special case here: if the resolver sent a result but it only
      contains a CNAME while we are looking for a T_A or T_AAAA record,
      we fail with NOTFOUND instead of TRYAGAIN.  */
-  return canon == NULL ? NSS_STATUS_TRYAGAIN : NSS_STATUS_NOTFOUND;
+  if (canon != NULL)
+    {
+      *h_errnop = HOST_NOT_FOUND;
+      return NSS_STATUS_NOTFOUND;
+    }
+
+  *h_errnop = NETDB_INTERNAL;
+  return NSS_STATUS_TRYAGAIN;
 }
 
 
@@ -1231,11 +1241,101 @@ gaih_getanswer (const querybuf *answer1,
 
   enum nss_status status = NSS_STATUS_NOTFOUND;
 
+  /* Combining the NSS status of two distinct queries requires some
+     compromise and attention to symmetry (A or AAAA queries can be
+     returned in any order).  What follows is a breakdown of how this
+     code is expected to work and why. We discuss only SUCCESS,
+     TRYAGAIN, NOTFOUND and UNAVAIL, since they are the only returns
+     that apply (though RETURN and MERGE exist).  We make a distinction
+     between TRYAGAIN (recoverable) and TRYAGAIN' (not-recoverable).
+     A recoverable TRYAGAIN is almost always due to buffer size issues
+     and returns ERANGE in errno and the caller is expected to retry
+     with a larger buffer.
+
+     Lastly, you may be tempted to make significant changes to the
+     conditions in this code to bring about symmetry between responses.
+     Please don't change anything without due consideration for
+     expected application behaviour.  Some of the synthesized responses
+     aren't very well thought out and sometimes appear to imply that
+     IPv4 responses are always answer 1, and IPv6 responses are always
+     answer 2, but that's not true (see the implemetnation of send_dg
+     and send_vc to see response can arrive in any order, particlarly
+     for UDP). However, we expect it holds roughly enough of the time
+     that this code works, but certainly needs to be fixed to make this
+     a more robust implementation.
+
+     ----------------------------------------------
+     | Answer 1 Status /   | Synthesized | Reason |
+     | Answer 2 Status     | Status      |        |
+     |--------------------------------------------|
+     | SUCCESS/SUCCESS     | SUCCESS     | [1]    |
+     | SUCCESS/TRYAGAIN    | TRYAGAIN    | [5]    |
+     | SUCCESS/TRYAGAIN'   | SUCCESS     | [1]    |
+     | SUCCESS/NOTFOUND    | SUCCESS     | [1]    |
+     | SUCCESS/UNAVAIL     | SUCCESS     | [1]    |
+     | TRYAGAIN/SUCCESS    | TRYAGAIN    | [2]    |
+     | TRYAGAIN/TRYAGAIN   | TRYAGAIN    | [2]    |
+     | TRYAGAIN/TRYAGAIN'  | TRYAGAIN    | [2]    |
+     | TRYAGAIN/NOTFOUND   | TRYAGAIN    | [2]    |
+     | TRYAGAIN/UNAVAIL    | TRYAGAIN    | [2]    |
+     | TRYAGAIN'/SUCCESS   | SUCCESS     | [3]    |
+     | TRYAGAIN'/TRYAGAIN  | TRYAGAIN    | [3]    |
+     | TRYAGAIN'/TRYAGAIN' | TRYAGAIN'   | [3]    |
+     | TRYAGAIN'/NOTFOUND  | TRYAGAIN'   | [3]    |
+     | TRYAGAIN'/UNAVAIL   | UNAVAIL     | [3]    |
+     | NOTFOUND/SUCCESS    | SUCCESS     | [3]    |
+     | NOTFOUND/TRYAGAIN   | TRYAGAIN    | [3]    |
+     | NOTFOUND/TRYAGAIN'  | TRYAGAIN'   | [3]    |
+     | NOTFOUND/NOTFOUND   | NOTFOUND    | [3]    |
+     | NOTFOUND/UNAVAIL    | UNAVAIL     | [3]    |
+     | UNAVAIL/SUCCESS     | UNAVAIL     | [4]    |
+     | UNAVAIL/TRYAGAIN    | UNAVAIL     | [4]    |
+     | UNAVAIL/TRYAGAIN'   | UNAVAIL     | [4]    |
+     | UNAVAIL/NOTFOUND    | UNAVAIL     | [4]    |
+     | UNAVAIL/UNAVAIL     | UNAVAIL     | [4]    |
+     ----------------------------------------------
+
+     [1] If the first response is a success we return success.
+         This ignores the state of the second answer and in fact
+         incorrectly sets errno and h_errno to that of the second
+	 answer.  However because the response is a success we ignore
+	 *errnop and *h_errnop (though that means you touched errno on
+         success).  We are being conservative here and returning the
+         likely IPv4 response in the first answer as a success.
+
+     [2] If the first response is a recoverable TRYAGAIN we return
+	 that instead of looking at the second response.  The
+	 expectation here is that we have failed to get an IPv4 response
+	 and should retry both queries.
+
+     [3] If the first response was not a SUCCESS and the second
+	 response is not NOTFOUND (had a SUCCESS, need to TRYAGAIN,
+	 or failed entirely e.g. TRYAGAIN' and UNAVAIL) then use the
+	 result from the second response, otherwise the first responses
+	 status is used.  Again we have some odd side-effects when the
+	 second response is NOTFOUND because we overwrite *errnop and
+	 *h_errnop that means that a first answer of NOTFOUND might see
+	 its *errnop and *h_errnop values altered.  Whether it matters
+	 in practice that a first response NOTFOUND has the wrong
+	 *errnop and *h_errnop is undecided.
+
+     [4] If the first response is UNAVAIL we return that instead of
+	 looking at the second response.  The expectation here is that
+	 it will have failed similarly e.g. configuration failure.
+
+     [5] Testing this code is complicated by the fact that truncated
+	 second response buffers might be returned as SUCCESS if the
+	 first answer is a SUCCESS.  To fix this we add symmetry to
+	 TRYAGAIN with the second response.  If the second response
+	 is a recoverable error we now return TRYAGIN even if the first
+	 response was SUCCESS.  */
+
   if (anslen1 > 0)
     status = gaih_getanswer_slice(answer1, anslen1, qname,
 				  &pat, &buffer, &buflen,
 				  errnop, h_errnop, ttlp,
 				  &first);
+
   if ((status == NSS_STATUS_SUCCESS || status == NSS_STATUS_NOTFOUND
        || (status == NSS_STATUS_TRYAGAIN
 	   && (*errnop != ERANGE || *h_errnop == NO_RECOVERY)))
@@ -1245,8 +1345,15 @@ gaih_getanswer (const querybuf *answer1,
 						     &pat, &buffer, &buflen,
 						     errnop, h_errnop, ttlp,
 						     &first);
+      /* Use the second response status in some cases.  */
       if (status != NSS_STATUS_SUCCESS && status2 != NSS_STATUS_NOTFOUND)
 	status = status2;
+      /* Do not return a truncated second response (unless it was
+         unavoidable e.g. unrecoverable TRYAGAIN).  */
+      if (status == NSS_STATUS_SUCCESS
+	  && (status2 == NSS_STATUS_TRYAGAIN
+	      && *errnop == ERANGE && *h_errnop != NO_RECOVERY))
+	status = NSS_STATUS_TRYAGAIN;
     }
 
   return status;
Index: glibc-2.12-2-gc4ccff1/resolv/res_send.c
===================================================================
--- glibc-2.12-2-gc4ccff1.orig/resolv/res_send.c
+++ glibc-2.12-2-gc4ccff1/resolv/res_send.c
@@ -1,3 +1,20 @@
+/* Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
 /*
  * Copyright (c) 1985, 1989, 1993
  *    The Regents of the University of California.  All rights reserved.
@@ -360,6 +377,8 @@ __libc_res_nsend(res_state statp, const
 #ifdef USE_HOOKS
 	if (__builtin_expect (statp->qhook || statp->rhook, 0)) {
 		if (anssiz < MAXPACKET && ansp) {
+			/* Always allocate MAXPACKET, callers expect
+			   this specific size.  */
 			u_char *buf = malloc (MAXPACKET);
 			if (buf == NULL)
 				return (-1);
@@ -652,6 +671,77 @@ libresolv_hidden_def (res_nsend)
 
 /* Private */
 
+/* The send_vc function is responsible for sending a DNS query over TCP
+   to the nameserver numbered NS from the res_state STATP i.e.
+   EXT(statp).nssocks[ns].  The function supports sending both IPv4 and
+   IPv6 queries at the same serially on the same socket.
+
+   Please note that for TCP there is no way to disable sending both
+   queries, unlike UDP, which honours RES_SNGLKUP and RES_SNGLKUPREOP
+   and sends the queries serially and waits for the result after each
+   sent query.  This implemetnation should be corrected to honour these
+   options.
+
+   Please also note that for TCP we send both queries over the same
+   socket one after another.  This technically violates best practice
+   since the server is allowed to read the first query, respond, and
+   then close the socket (to service another client).  If the server
+   does this, then the remaining second query in the socket data buffer
+   will cause the server to send the client an RST which will arrive
+   asynchronously and the client's OS will likely tear down the socket
+   receive buffer resulting in a potentially short read and lost
+   response data.  This will force the client to retry the query again,
+   and this process may repeat until all servers and connection resets
+   are exhausted and then the query will fail.  It's not known if this
+   happens with any frequency in real DNS server implementations.  This
+   implementation should be corrected to use two sockets by default for
+   parallel queries.
+
+   The query stored in BUF of BUFLEN length is sent first followed by
+   the query stored in BUF2 of BUFLEN2 length.  Queries are sent
+   serially on the same socket.
+
+   Answers to the query are stored firstly in *ANSP up to a max of
+   *ANSSIZP bytes.  If more than *ANSSIZP bytes are needed and ANSCP
+   is non-NULL (to indicate that modifying the answer buffer is allowed)
+   then malloc is used to allocate a new response buffer and ANSCP and
+   ANSP will both point to the new buffer.  If more than *ANSSIZP bytes
+   are needed but ANSCP is NULL, then as much of the response as
+   possible is read into the buffer, but the results will be truncated.
+   When truncation happens because of a small answer buffer the DNS
+   packets header feild TC will bet set to 1, indicating a truncated
+   message and the rest of the socket data will be read and discarded.
+
+   Answers to the query are stored secondly in *ANSP2 up to a max of
+   *ANSSIZP2 bytes, with the actual response length stored in
+   *RESPLEN2.  If more than *ANSSIZP bytes are needed and ANSP2
+   is non-NULL (required for a second query) then malloc is used to
+   allocate a new response buffer, *ANSSIZP2 is set to the new buffer
+   size and *ANSP2_MALLOCED is set to 1.
+
+   The ANSP2_MALLOCED argument will eventually be removed as the
+   change in buffer pointer can be used to detect the buffer has
+   changed and that the caller should use free on the new buffer.
+
+   Note that the answers may arrive in any order from the server and
+   therefore the first and second answer buffers may not correspond to
+   the first and second queries.
+
+   It is not supported to call this function with a non-NULL ANSP2
+   but a NULL ANSCP.  Put another way, you can call send_vc with a
+   single unmodifiable buffer or two modifiable buffers, but no other
+   combination is supported.
+
+   It is the caller's responsibility to free the malloc allocated
+   buffers by detecting that the pointers have changed from their
+   original values i.e. *ANSCP or *ANSP2 has changed.
+
+   If errors are encountered then *TERRNO is set to an appropriate
+   errno value and a zero result is returned for a recoverable error,
+   and a less-than zero result is returned for a non-recoverable error.
+
+   If no errors are encountered then *TERRNO is left unmodified and
+   a the length of the first response in bytes is returned.  */
 static int
 send_vc(res_state statp,
 	const u_char *buf, int buflen, const u_char *buf2, int buflen2,
@@ -661,11 +751,7 @@ send_vc(res_state statp,
 {
 	const HEADER *hp = (HEADER *) buf;
 	const HEADER *hp2 = (HEADER *) buf2;
-	u_char *ans = *ansp;
-	int orig_anssizp = *anssizp;
-	// XXX REMOVE
-	// int anssiz = *anssizp;
-	HEADER *anhp = (HEADER *) ans;
+	HEADER *anhp = (HEADER *) *ansp;
 	struct sockaddr_in6 *nsap = EXT(statp).nsaddrs[ns];
 	int truncating, connreset, resplen, n;
 	struct iovec iov[4];
@@ -741,6 +827,8 @@ send_vc(res_state statp,
 	 * Receive length & response
 	 */
 	int recvresp1 = 0;
+	/* Skip the second response if there is no second query.
+           To do that we mark the second response as received.  */
 	int recvresp2 = buf2 == NULL;
 	uint16_t rlen16;
  read_len:
@@ -777,33 +865,14 @@ send_vc(res_state statp,
 	u_char **thisansp;
 	int *thisresplenp;
 	if ((recvresp1 | recvresp2) == 0 || buf2 == NULL) {
+		/* We have not received any responses
+		   yet or we only have one response to
+		   receive.  */
 		thisanssizp = anssizp;
 		thisansp = anscp ?: ansp;
 		assert (anscp != NULL || ansp2 == NULL);
 		thisresplenp = &resplen;
 	} else {
-		if (*anssizp != MAXPACKET) {
-			/* No buffer allocated for the first
-			   reply.  We can try to use the rest
-			   of the user-provided buffer.  */
-#ifdef _STRING_ARCH_unaligned
-			*anssizp2 = orig_anssizp - resplen;
-			*ansp2 = *ansp + resplen;
-#else
-			int aligned_resplen
-			  = ((resplen + __alignof__ (HEADER) - 1)
-			     & ~(__alignof__ (HEADER) - 1));
-			*anssizp2 = orig_anssizp - aligned_resplen;
-			*ansp2 = *ansp + aligned_resplen;
-#endif
-		} else {
-			/* The first reply did not fit into the
-			   user-provided buffer.  Maybe the second
-			   answer will.  */
-			*anssizp2 = orig_anssizp;
-			*ansp2 = *ansp;
-		}
-
 		thisanssizp = anssizp2;
 		thisansp = ansp2;
 		thisresplenp = resplen2;
@@ -811,10 +880,14 @@ send_vc(res_state statp,
 	anhp = (HEADER *) *thisansp;
 
 	*thisresplenp = rlen;
-	if (rlen > *thisanssizp) {
-		/* Yes, we test ANSCP here.  If we have two buffers
-		   both will be allocatable.  */
-		if (__builtin_expect (anscp != NULL, 1)) {
+	/* Is the answer buffer too small?  */
+	if (*thisanssizp < rlen) {
+		/* If the current buffer is non-NULL and it's not
+		   pointing at the static user-supplied buffer then
+		   we can reallocate it.  */
+		if (thisansp != NULL && thisansp != ansp) {
+			/* Always allocate MAXPACKET, callers expect
+			   this specific size.  */
 			u_char *newp = malloc (MAXPACKET);
 			if (newp == NULL) {
 				*terrno = ENOMEM;
@@ -824,6 +897,9 @@ send_vc(res_state statp,
 			*thisanssizp = MAXPACKET;
 			*thisansp = newp;
 			anhp = (HEADER *) newp;
+			/* A uint16_t can't be larger than MAXPACKET
+			   thus it's safe to allocate MAXPACKET but
+			   read RLEN bytes instead.  */
 			len = rlen;
 		} else {
 			Dprint(statp->options & RES_DEBUG,
@@ -987,6 +1063,66 @@ reopen (res_state statp, int *terrno, in
 	return 1;
 }
 
+/* The send_dg function is responsible for sending a DNS query over UDP
+   to the nameserver numbered NS from the res_state STATP i.e.
+   EXT(statp).nssocks[ns].  The function supports IPv4 and IPv6 queries
+   along with the ability to send the query in parallel for both stacks
+   (default) or serially (RES_SINGLKUP).  It also supports serial lookup
+   with a close and reopen of the socket used to talk to the server
+   (RES_SNGLKUPREOP) to work around broken name servers.
+
+   The query stored in BUF of BUFLEN length is sent first followed by
+   the query stored in BUF2 of BUFLEN2 length.  Queries are sent
+   in parallel (default) or serially (RES_SINGLKUP or RES_SNGLKUPREOP).
+
+   Answers to the query are stored firstly in *ANSP up to a max of
+   *ANSSIZP bytes.  If more than *ANSSIZP bytes are needed and ANSCP
+   is non-NULL (to indicate that modifying the answer buffer is allowed)
+   then malloc is used to allocate a new response buffer and ANSCP and
+   ANSP will both point to the new buffer.  If more than *ANSSIZP bytes
+   are needed but ANSCP is NULL, then as much of the response as
+   possible is read into the buffer, but the results will be truncated.
+   When truncation happens because of a small answer buffer the DNS
+   packets header feild TC will bet set to 1, indicating a truncated
+   message, while the rest of the UDP packet is discarded.
+
+   Answers to the query are stored secondly in *ANSP2 up to a max of
+   *ANSSIZP2 bytes, with the actual response length stored in
+   *RESPLEN2.  If more than *ANSSIZP bytes are needed and ANSP2
+   is non-NULL (required for a second query) then malloc is used to
+   allocate a new response buffer, *ANSSIZP2 is set to the new buffer
+   size and *ANSP2_MALLOCED is set to 1.
+
+   The ANSP2_MALLOCED argument will eventually be removed as the
+   change in buffer pointer can be used to detect the buffer has
+   changed and that the caller should use free on the new buffer.
+
+   Note that the answers may arrive in any order from the server and
+   therefore the first and second answer buffers may not correspond to
+   the first and second queries.
+
+   It is not supported to call this function with a non-NULL ANSP2
+   but a NULL ANSCP.  Put another way, you can call send_vc with a
+   single unmodifiable buffer or two modifiable buffers, but no other
+   combination is supported.
+
+   It is the caller's responsibility to free the malloc allocated
+   buffers by detecting that the pointers have changed from their
+   original values i.e. *ANSCP or *ANSP2 has changed.
+
+   If an answer is truncated because of UDP datagram DNS limits then
+   *V_CIRCUIT is set to 1 and the return value non-zero to indicate to
+   the caller to retry with TCP.  The value *GOTSOMEWHERE is set to 1
+   if any progress was made reading a response from the nameserver and
+   is used by the caller to distinguish between ECONNREFUSED and
+   ETIMEDOUT (the latter if *GOTSOMEWHERE is 1).
+
+   If errors are encountered then *TERRNO is set to an appropriate
+   errno value and a zero result is returned for a recoverable error,
+   and a less-than zero result is returned for a non-recoverable error.
+
+   If no errors are encountered then *TERRNO is left unmodified and
+   a the length of the first response in bytes is returned.  */
 static int
 send_dg(res_state statp,
 	const u_char *buf, int buflen, const u_char *buf2, int buflen2,
@@ -996,8 +1132,6 @@ send_dg(res_state statp,
 {
 	const HEADER *hp = (HEADER *) buf;
 	const HEADER *hp2 = (HEADER *) buf2;
-	u_char *ans = *ansp;
-	int orig_anssizp = *anssizp;
 	struct timespec now, timeout, finish;
 	struct pollfd pfd[1];
 	int ptimeout;
@@ -1029,6 +1163,8 @@ send_dg(res_state statp,
 	int need_recompute = 0;
 	int nwritten = 0;
 	int recvresp1 = 0;
+	/* Skip the second response if there is no second query.
+           To do that we mark the second response as received.  */
 	int recvresp2 = buf2 == NULL;
 	pfd[0].fd = EXT(statp).nssocks[ns];
 	pfd[0].events = POLLOUT;
@@ -1125,50 +1261,52 @@ send_dg(res_state statp,
 		int *thisresplenp;
 
 		if ((recvresp1 | recvresp2) == 0 || buf2 == NULL) {
+			/* We have not received any responses
+			   yet or we only have one response to
+			   receive.  */
 			thisanssizp = anssizp;
 			thisansp = anscp ?: ansp;
 			assert (anscp != NULL || ansp2 == NULL);
 			thisresplenp = &resplen;
 		} else {
-			if (*anssizp != MAXPACKET) {
-				/* No buffer allocated for the first
-				   reply.  We can try to use the rest
-				   of the user-provided buffer.  */
-#ifdef _STRING_ARCH_unaligned
-				*anssizp2 = orig_anssizp - resplen;
-				*ansp2 = *ansp + resplen;
-#else
-				int aligned_resplen
-				  = ((resplen + __alignof__ (HEADER) - 1)
-				     & ~(__alignof__ (HEADER) - 1));
-				*anssizp2 = orig_anssizp - aligned_resplen;
-				*ansp2 = *ansp + aligned_resplen;
-#endif
-			} else {
-				/* The first reply did not fit into the
-				   user-provided buffer.  Maybe the second
-				   answer will.  */
-				*anssizp2 = orig_anssizp;
-				*ansp2 = *ansp;
-			}
-
 			thisanssizp = anssizp2;
 			thisansp = ansp2;
 			thisresplenp = resplen2;
 		}
 
 		if (*thisanssizp < MAXPACKET
-		    /* Yes, we test ANSCP here.  If we have two buffers
-		       both will be allocatable.  */
-		    && anscp
+		    /* If the current buffer is non-NULL and it's not
+		       pointing at the static user-supplied buffer then
+		       we can reallocate it.  */
+		    && (thisansp != NULL && thisansp != ansp)
+		    /* Is the size too small?  */
 		    && (ioctl (pfd[0].fd, FIONREAD, thisresplenp) < 0
-			|| *thisanssizp < *thisresplenp)) {
+			|| *thisanssizp < *thisresplenp)
+		    ) {
+			/* Always allocate MAXPACKET, callers expect
+			   this specific size.  */
 			u_char *newp = malloc (MAXPACKET);
 			if (newp != NULL) {
-				*anssizp = MAXPACKET;
-				*thisansp = ans = newp;
+				*thisanssizp = MAXPACKET;
+				*thisansp = newp;
 			}
 		}
+		/* We could end up with truncation if anscp was NULL
+		   (not allowed to change caller's buffer) and the
+		   response buffer size is too small.  This isn't a
+		   reliable way to detect truncation because the ioctl
+		   may be an inaccurate report of the UDP message size.
+		   Therefore we use this only to issue debug output.
+		   To do truncation accurately with UDP we need
+		   MSG_TRUNC which is only available on Linux.  We
+		   can abstract out the Linux-specific feature in the
+		   future to detect truncation.  */
+		if (__glibc_unlikely (*thisanssizp < *thisresplenp)) {
+			Dprint(statp->options & RES_DEBUG,
+			       (stdout, ";; response may be truncated (UDP)\n")
+			);
+		}
+
 		HEADER *anhp = (HEADER *) *thisansp;
 		socklen_t fromlen = sizeof(struct sockaddr_in6);
 		assert (sizeof(from) <= fromlen);
Index: glibc-2.12-2-gc4ccff1/resolv/res_query.c
===================================================================
--- glibc-2.12-2-gc4ccff1.orig/resolv/res_query.c
+++ glibc-2.12-2-gc4ccff1/resolv/res_query.c
@@ -391,6 +391,7 @@ __libc_res_nsearch(res_state statp,
 		    && (*answerp2 < answer || *answerp2 >= answer + anslen))
 		  {
 		    free (*answerp2);
+		    *nanswerp2 = 0;
 		    *answerp2 = NULL;
 		  }
 	}
@@ -431,6 +432,7 @@ __libc_res_nsearch(res_state statp,
 				|| *answerp2 >= answer + anslen))
 			  {
 			    free (*answerp2);
+			    *nanswerp2 = 0;
 			    *answerp2 = NULL;
 			  }
 
@@ -503,6 +505,7 @@ __libc_res_nsearch(res_state statp,
 	if (answerp2 && (*answerp2 < answer || *answerp2 >= answer + anslen))
 	  {
 	    free (*answerp2);
+	    *nanswerp2 = NULL;
 	    *answerp2 = NULL;
 	  }
 	if (saved_herrno != -1)

 

commit 2c1094bd700e63a8d7f547b3f5495bedb55c0a08
Author: Ulrich Drepper <drepper@gmail.com>
Date:   Thu Dec 22 22:43:39 2011 -0500

    Create internal threads with sufficient stack size

Index: glibc-2.12-2-gc4ccff1/nptl/Versions
===================================================================
--- glibc-2.12-2-gc4ccff1.orig/nptl/Versions
+++ glibc-2.12-2-gc4ccff1/nptl/Versions
@@ -255,6 +255,6 @@ libpthread {
   GLIBC_PRIVATE {
     __pthread_initialize_minimal;
     __pthread_clock_gettime; __pthread_clock_settime;
-    __pthread_unwind;
+    __pthread_unwind; __pthread_get_minstack;
   }
 }
Index: glibc-2.12-2-gc4ccff1/nptl/nptl-init.c
===================================================================
--- glibc-2.12-2-gc4ccff1.orig/nptl/nptl-init.c
+++ glibc-2.12-2-gc4ccff1/nptl/nptl-init.c
@@ -507,3 +507,13 @@ __pthread_initialize_minimal_internal (i
 }
 strong_alias (__pthread_initialize_minimal_internal,
 	      __pthread_initialize_minimal)
+
+
+size_t
+__pthread_get_minstack (const pthread_attr_t *attr)
+{
+  struct pthread_attr *iattr = (struct pthread_attr *) attr;
+
+  return (GLRO(dl_pagesize) + __static_tls_size + PTHREAD_STACK_MIN
+	  + iattr->guardsize);
+}
Index: glibc-2.12-2-gc4ccff1/nptl/pthreadP.h
===================================================================
--- glibc-2.12-2-gc4ccff1.orig/nptl/pthreadP.h
+++ glibc-2.12-2-gc4ccff1/nptl/pthreadP.h
@@ -397,6 +397,7 @@ weak_function;
 
 extern void __pthread_init_static_tls (struct link_map *) attribute_hidden;
 
+extern size_t __pthread_get_minstack (const pthread_attr_t *attr);
 
 /* Namespace save aliases.  */
 extern int __pthread_getschedparam (pthread_t thread_id, int *policy,
Index: glibc-2.12-2-gc4ccff1/nptl/sysdeps/unix/sysv/linux/timer_routines.c
===================================================================
--- glibc-2.12-2-gc4ccff1.orig/nptl/sysdeps/unix/sysv/linux/timer_routines.c
+++ glibc-2.12-2-gc4ccff1/nptl/sysdeps/unix/sysv/linux/timer_routines.c
@@ -165,7 +165,7 @@ __start_helper_thread (void)
      and should go away automatically when canceled.  */
   pthread_attr_t attr;
   (void) pthread_attr_init (&attr);
-  (void) pthread_attr_setstacksize (&attr, PTHREAD_STACK_MIN);
+  (void) pthread_attr_setstacksize (&attr, __pthread_get_minstack (&attr));
 
   /* Block all signals in the helper thread but SIGSETXID.  To do this
      thoroughly we temporarily have to block all signals here.  The

 

投稿者: yohgaki