Network Working Group                                         M. Vavrusa
Internet-Draft                                                    CZ.NIC
Intended status: Standards Track                          March 23, 2015
Expires: September 24, 2015
DNS Message Compression
draft-vavrusa-dnscompr-00
This document describes an application of data compression algorithms to DNS messages, with the goal of reducing the size of frequent responses. The client proposes a compression algorithm, which the server may use to compress an arbitrary part of the response.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 24, 2015.
Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
The reader is assumed to be familiar with the basic DNS and EDNS concepts described in [RFC1034], [RFC1035] and [RFC6891]. The terms "compression" and "decompression" are used as in [RFC1951].
Domain name compression was introduced in Section 4.1.4 of [RFC1035] with the intent of reducing message length by avoiding repeated sequences of domain name labels. This has proven especially useful for UDP-carried messages, so Section 6.1.2.4 of [RFC1123] mandated the use of compression in responses.
A domain name is represented by a sequence of labels, where the first octet of each label denotes the label length, excluding itself. Each domain name is required to be terminated by a zero-length label representing the root domain name. [RFC1035] declares that a label length MUST be 63 octets or less. The length therefore needs only the 6 least significant bits of the first octet, which leaves the 2 most significant bits available for a second meaning.
If both most significant bits have a value of '1', the following 14 bits represent a compression pointer, which denotes the position in the message where the name continues. That position may itself hold a compression pointer, as long as it points backwards and the final name doesn't exceed the size limits defined in Section 2.3.4 of [RFC1035]. This method implies that only repeated domain name labels can be compressed.
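The label encoding described above can be sketched as follows. This is a minimal illustration, not part of the draft: the function name `encode_name` is hypothetical, labels are assumed to be ASCII, and no name compression is applied.

```python
def encode_name(name: str) -> bytes:
    """Encode a dotted domain name as length-prefixed labels (uncompressed)."""
    if name.endswith("."):
        name = name[:-1]                 # drop the trailing root dot
    labels = name.split(".") if name else []
    out = bytearray()
    for label in labels:
        raw = label.encode("ascii")
        if not raw:
            raise ValueError("empty label inside a name")
        if len(raw) > 63:
            raise ValueError("label exceeds 63 octets")
        out.append(len(raw))             # length fits in 6 bits, top bits are 00
        out += raw
    out.append(0)                        # zero-length label for the root
    return bytes(out)
```

Because the length never exceeds 63, the two most significant bits of every length octet produced here are zero, which is what frees the other bit combinations for other uses.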
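A sketch of how a reader might follow compression pointers while enforcing the two conditions mentioned above (backward pointers and the 255-octet name limit). The function name `decode_name` is hypothetical, labels are assumed to be ASCII, and bounds checks on truncated messages are omitted for brevity.

```python
from typing import Tuple

def decode_name(msg: bytes, offset: int) -> Tuple[str, int]:
    """Return (name, offset of the first octet after the name in place)."""
    labels = []
    resume = None          # where parsing continues after the first pointer
    wire_len = 1           # counts the terminating root label
    while True:
        first = msg[offset]
        if first & 0xC0 == 0xC0:                     # compression pointer
            target = ((first & 0x3F) << 8) | msg[offset + 1]
            if target >= offset:
                raise ValueError("pointer must point backwards")
            if resume is None:
                resume = offset + 2
            offset = target
        elif first & 0xC0 == 0x00:                   # ordinary label
            if first == 0:
                offset += 1
                break
            wire_len += 1 + first
            if wire_len > 255:                       # also stops pointer loops
                raise ValueError("name exceeds 255 octets")
            labels.append(msg[offset + 1:offset + 1 + first].decode("ascii"))
            offset += 1 + first
        else:
            raise ValueError("unsupported label type")
    return ".".join(labels) + ".", (resume if resume is not None else offset)
```

The backward-pointer check alone does not prevent a loop (a label followed by a pointer to that same label satisfies it), so the length limit is what guarantees termination in this sketch.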
Later, Section 5 of [RFC6891] defined an extended label type, where the most significant two bits have a value of '01', and the remaining 6 bits are used for extended label type. Section 3 of [RFC3363] has shown that the extended label types are rejected as malformed by unaware DNS implementations.
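The four possible values of the two most significant bits can be summarized in a small classifier. This is an illustrative sketch only; the function name and the returned strings are hypothetical, and the '10' combination is treated as reserved, as in [RFC1035].

```python
def classify_label_octet(octet: int) -> str:
    """Classify the first octet of a label by its two most significant bits."""
    top = (octet >> 6) & 0b11
    if top == 0b00:
        return "normal"      # lower 6 bits are the label length
    if top == 0b11:
        return "pointer"     # lower 6 bits start a 14-bit offset
    if top == 0b01:
        return "extended"    # lower 6 bits are the extended label type
    return "reserved"        # '10' has no assigned meaning
```

An implementation unaware of extended label types has no branch for the '01' case, which is consistent with such labels being rejected as malformed.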
The proposed compression method introduces an extended label type to indicate that the remainder of the message is compressed, and an OPT RR option COMPRESS to negotiate compression support. To ensure compatibility with existing infrastructure, the new label type MUST NOT be used in a DNS query, and it MAY be used in a DNS response only after support is indicated by the presence of the COMPRESS option in the query.
The client proposes a compression algorithm via the COMPRESS OPT option, which requires that both the client and the server support EDNS [RFC6891]. If the server doesn't support EDNS, no OPT RR is returned in the response and no compression occurs. A COMPRESS-aware server MAY place a compression indicator at the start of any label in the message, followed by the compressed remainder of the message. The server SHOULD use the client-proposed algorithm if it supports it, but it MAY use the mandatory algorithm instead.
If a client recognizes a compression indicator, it decompresses the remainder of the message in place. If the server uses the mandatory algorithm instead of the proposed one, the client SHOULD assume that the server doesn't support the proposed algorithm, and SHOULD try a different algorithm next time. If the response isn't compressed, the client MUST NOT presume that the server doesn't support compression.
COMPRESS is an OPT RR [RFC6891] option that can be included once in the RDATA of an OPT RR in DNS messages.
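The negotiation rules above can be sketched as two small decision functions. All names here are hypothetical, `None` stands for "no COMPRESS option" or "response not compressed", and the server is shown always compressing when allowed, although the draft leaves that choice to the server.

```python
MANDATORY = 0   # algorithm number of the mandatory algorithm

def server_choose(proposed, supported):
    """Pick the algorithm for a response, or None to send it uncompressed."""
    if proposed is None:
        return None              # no COMPRESS option in the query
    if proposed in supported:
        return proposed          # SHOULD use the client-proposed algorithm
    return MANDATORY             # MAY fall back to the mandatory algorithm

def client_note_response(proposed, used, unsupported):
    """Record what the client may conclude from one response."""
    if used is None:
        return                   # uncompressed: MUST NOT presume anything
    if used == MANDATORY and proposed != MANDATORY:
        unsupported.add(proposed)    # try a different algorithm next time
```

For example, a client that proposed algorithm 1 and received a response compressed with algorithm 0 would add 1 to its `unsupported` set for that server.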
The option is encoded in 5 bytes as shown below.
                        1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        OPTION-CODE TBD        |       OPTION-LENGTH = 1       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Algorithm   |
   +-+-+-+-+-+-+-+-+
Figure 1: OPT RR option COMPRESS format.
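As an illustration of the 5-octet layout, the option can be packed and unpacked as sketched below. The option code is still TBD in this document, so `COMPRESS_OPTION_CODE` is a hypothetical placeholder taken from the EDNS local/experimental range; the function names are likewise assumptions.

```python
import struct

COMPRESS_OPTION_CODE = 65001   # hypothetical placeholder, the real code is TBD

def encode_compress_option(algorithm: int) -> bytes:
    """Pack OPTION-CODE, OPTION-LENGTH = 1 and the Algorithm octet."""
    if not 0 <= algorithm <= 255:
        raise ValueError("algorithm must fit in one octet")
    return struct.pack("!HHB", COMPRESS_OPTION_CODE, 1, algorithm)

def decode_compress_option(data: bytes) -> int:
    """Return the proposed algorithm number from a 5-octet COMPRESS option."""
    code, length, algorithm = struct.unpack("!HHB", data)
    if code != COMPRESS_OPTION_CODE or length != 1:
        raise ValueError("not a COMPRESS option")
    return algorithm
```

The `!` prefix selects network byte order without padding, so the packed result is exactly 5 octets.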
The Algorithm field defines the compression algorithm proposed by the client, denoted by an assigned number. There are 256 possible values, including the mandatory compression algorithm. In the case of exhaustion, a new OPT code should be proposed. A value of zero indicates the mandatory compression algorithm.
The COMPRESS option MAY be used in the query to indicate a compression request and the proposed algorithm. The use of the COMPRESS option in the response is not defined.
@TODO@
a. DEFLATE, code 0x00, MANDATORY
A lossless compressed data format that compresses data using a combination of the LZ77 algorithm and Huffman coding. As noted in [RFC1951], the format can be implemented readily in a manner not covered by patents.
b. LZ4, code 0x01, OPTIONAL
LZ4 is a lossless data compression algorithm focused on compression and decompression speed. A BSD-licensed implementation written by Yann Collet is available; no RFC is available to date.
@TODO@ Needs further research, compression algorithms are a minefield of patents.
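For the mandatory algorithm, a raw DEFLATE stream as specified in [RFC1951] can be produced with a general-purpose library. The sketch below uses Python's `zlib` module and assumes that the draft intends the raw format, without the zlib or gzip wrapper; the function names are hypothetical. LZ4 is not shown because it would need a third-party library.

```python
import zlib

def deflate(data: bytes) -> bytes:
    """Compress to a raw DEFLATE stream (negative wbits: no zlib header)."""
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()

def inflate(data: bytes) -> bytes:
    """Decompress a raw DEFLATE stream."""
    return zlib.decompress(data, -15)
```

Passing a negative window size is what selects the raw format; with the default the output would carry a 2-octet header and a 4-octet checksum, which matters for messages this small.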
This document proposes an alternative remainder compression indicator:
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   | 0  1| 0  0  0  0  0  1|       ALGORITHM       |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
Figure 2: Compression indicator extended label type format.
The "01" denotes the extended label type, as defined in Section 5 of [RFC6891]. The remaining part of the first octet, "000001", defines a remainder compression indicator. The next octet identifies the compression algorithm used.
Note - a decompressed remainder of the message MUST NOT contain a compression indicator, thus a message can only be compressed once.
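The framing can be sketched as follows: the indicator octet 0x41 ('01' followed by '000001'), the algorithm octet, then the compressed remainder. This is an illustration under assumptions: the function names are hypothetical, only algorithm 0 (raw DEFLATE) is handled, and the caller is assumed to already know the offset `cut` at which a label starts.

```python
import zlib

INDICATOR = 0x41   # 0b01000001: extended label type, remainder compression
DEFLATE = 0x00     # mandatory algorithm number

def compress_remainder(msg: bytes, cut: int) -> bytes:
    """Replace msg[cut:] by an indicator followed by its compressed form."""
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)
    body = comp.compress(msg[cut:]) + comp.flush()
    return msg[:cut] + bytes([INDICATOR, DEFLATE]) + body

def expand_remainder(msg: bytes, cut: int) -> bytes:
    """Restore the original message from an indicator found at offset cut."""
    if msg[cut] != INDICATOR:
        raise ValueError("no compression indicator at this offset")
    if msg[cut + 1] != DEFLATE:
        raise ValueError("unsupported compression algorithm")
    return msg[:cut] + zlib.decompress(msg[cut + 2:], -15)
```

The sketch does not check that the decompressed remainder is free of further compression indicators; a real parser would reject such a message while parsing the restored octets.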
@TODO@
@TODO@
@TODO@
The proposed label format may not be correctly processed by existing software, so the following considerations must be taken into account:
Applications intercepting response messages may reject the message as malformed, but the author knows of no legitimate application that tampers with responses.
The client MUST follow the fallback procedure described in Section 6.2.2 of [RFC6891].
@TODO@
@REMARK@ Depends on algorithm and implementation, may be faster because of bandwidth savings, may be slower because of extra overhead.
@REMARK@ LZ4, for example, shows over 480 MB/s compression speed on a single core, which is almost 1M 512-byte packets per second per core.
@REMARK@ A response using this draft may not use label compression.
@REMARK@ TODO: performance measurements (number of cycles per compressed/uncompressed response)
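The LZ4 estimate in the remarks above can be checked with a short calculation. It assumes that MB means 10^6 bytes and that every packet is exactly 512 bytes; the variable names are illustrative only.

```python
THROUGHPUT_BYTES_PER_SEC = 480 * 10**6   # 480 MB/s, decimal megabytes assumed
PACKET_SIZE = 512                        # bytes per packet

packets_per_sec = THROUGHPUT_BYTES_PER_SEC // PACKET_SIZE
print(packets_per_sec)                   # 937500
```

Reading MB as 2^20 bytes instead gives 983040 packets per second, so the "almost 1M" figure holds under either interpretation.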
The compression library implementing the chosen compression algorithm is a liability. Corruption of compressed data is likely to be more severe than corruption of uncompressed data. The DNS implementation MUST therefore parse the message after decompression, as it would an uncompressed message, even though the decompression algorithm may itself detect corruption.
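One defensive measure that fits this section, although the draft does not specify it, is to bound the decompressed size so that a small hostile response cannot expand without limit. The sketch below assumes raw DEFLATE and a cap of 65535 octets (the largest message a 16-bit length can describe); the function name and the cap are assumptions. The result must still be parsed like any uncompressed message.

```python
import zlib

MAX_DNS_MESSAGE = 65535   # assumed upper bound on the decompressed size

def inflate_bounded(data: bytes, limit: int = MAX_DNS_MESSAGE) -> bytes:
    """Decompress raw DEFLATE data, refusing output larger than limit."""
    decomp = zlib.decompressobj(-15)
    out = decomp.decompress(data, limit + 1)   # never produce more than limit + 1
    if len(out) > limit:
        raise ValueError("decompressed data exceeds the size limit")
    if not decomp.eof:
        raise ValueError("truncated or incomplete DEFLATE stream")
    return out
```

A stricter implementation would subtract the length of the uncompressed prefix from the limit, since the cap applies to the whole message rather than to the remainder alone.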
@TODO@
[RFC1034]  Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, November 1987.
[RFC1035]  Mockapetris, P., "Domain names - implementation and specification", STD 13, RFC 1035, November 1987.
[RFC6891]  Damas, J., Graff, M. and P. Vixie, "Extension Mechanisms for DNS (EDNS(0))", STD 75, RFC 6891, April 2013.
[RFC1123]  Braden, R., "Requirements for Internet Hosts - Application and Support", STD 3, RFC 1123, October 1989.
[RFC1951]  Deutsch, P., "DEFLATE Compressed Data Format Specification version 1.3", RFC 1951, May 1996.
[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3363]  Bush, R., Durand, A., Fink, B., Gudmundsson, O. and T. Hain, "Representing Internet Protocol version 6 (IPv6) Addresses in the Domain Name System (DNS)", RFC 3363, August 2002.
@TODO@ @REMARK@ See https://github.com/vavrusa/rfc-dnscomp/tree/master/data