HTTP                                                          T. Meunier
Internet-Draft                                                Cloudflare
Intended status: Standards Track                             29 May 2025
Expires: 30 November 2025

              Signature-Agent for Robot Exclusion Protocol
                draft-meunier-signature-agent-rep-latest

Abstract

   This document defines a new directive that allows Signature-Agent
   (Section 4 of [SIGNATURE-DIRECTORY]) to be used in the Robots
   Exclusion Protocol ([RFC9309]).

About This Document

   This note is to be removed before publishing as an RFC.

   The latest revision of this draft can be found at
   https://thibmeu.github.io/http-message-signatures-directory/draft-meunier-signature-agent-rep.html.
   Status information for this document may be found at
   https://datatracker.ietf.org/doc/draft-meunier-signature-agent-rep/.

   Discussion of this document takes place on the HTTP mailing list
   (mailto:ietf-http-wg@w3.org), which is archived at
   https://lists.w3.org/Archives/Public/ietf-http-wg/.

   Source for this draft and an issue tracker can be found at
   https://github.com/thibmeu/http-message-signatures-directory.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 30 November 2025.

Copyright Notice

   Copyright (c) 2025 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Revised BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as described
   in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions and Definitions
   3.  Specification
     3.1.  Protocol Definition
     3.2.  Formal Syntax
       3.2.1.  Signature-Agent line
   4.  Security Considerations
   5.  IANA Considerations
   6.  Normative References
   Appendix A.  Examples
     A.1.  Signature-Agent on example.com
     A.2.  Signature-Agent with raw keyid
   Acknowledgments
   Author's Address

1.  Introduction

   Bots are increasingly using Signature-Agent as a way to convey
   identity.  As such, there is interest from Origins to define robot
   policy based on this header.  This document extends the Robots
   Exclusion Protocol to support such policies by defining a new kind of
   group that starts with a signature-agent line.

2.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Specification

3.1.  Protocol Definition

   Group
      One or more signature-agent lines that are followed by one or more
      rules.  The group is terminated by a signature-agent line or end
      of file.  See Section 3.2.1.  The last group may have no rules,
      which means it implicitly allows everything.

3.2.  Formal Syntax

   The following syntax is based on the formal syntax defined in
   Section 2.2 of [RFC9309]:

   startgroupline = user-agent-line / signature-agent-line
         ; a group can either be a user-agent, a signature-agent, or both.
   user-agent-line = *WS "user-agent" *WS ":" *WS product-token EOL
   signature-agent-line = *WS "signature-agent" *WS ":" *WS
                          directory-token EOL
   directory-token = fqdn
   fqdn = ... ; domain as defined by signature-agent

3.2.1.  Signature-Agent line

   Crawlers set their own identity, which is called a directory token,
   to find relevant groups.  The directory token MUST contain only
   lowercase letters ("a-z"), underscores ("_"), hyphens ("-"), and dots
   (".").  The directory token SHOULD be a valid FQDN suffix of the
   identification string that the crawler sends to the service.  For
   example, in the case of HTTP [RFC9110], the directory token SHOULD be
   a substring in the Signature-Agent header.  The identification string
   SHOULD describe the public cryptographic key material of the crawler.

   The following example shows a Signature-Agent HTTP request header
   sent by the crawler.example.com crawler.  The domain example.com
   appears as a suffix in the Signature-Agent HTTP header and as the
   directory token in the robots.txt signature-agent line.

   +======================================+==============================+
   | Signature-Agent HTTP header          | robots.txt signature-agent   |
   |                                      | line                         |
   +======================================+==============================+
   | Signature-Agent: crawler.example.com | signature-agent: example.com |
   +--------------------------------------+------------------------------+

4.  Security Considerations

   Signature-Agent groups share the security considerations of
   [RFC9309].  In addition, given that Signature-Agent MAY present a
   domain name identifying crawlers' public cryptographic key material,
   implementors should treat the content of a signature-agent line as
   possibly sensitive.

5.  IANA Considerations

   This document has no IANA actions.

6.  Normative References

   [BOT-AUTH] Meunier, T., "HTTP Message Signatures for automated
              traffic Architecture", Work in Progress, Internet-Draft,
              draft-meunier-web-bot-auth-architecture-01, 7 May 2025.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC9110]  Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
              Ed., "HTTP Semantics", STD 97, RFC 9110,
              DOI 10.17487/RFC9110, June 2022.

   [RFC9309]  Koster, M., Illyes, G., Zeller, H., and L. Sassman,
              "Robots Exclusion Protocol", RFC 9309,
              DOI 10.17487/RFC9309, September 2022.

   [SIGNATURE-DIRECTORY]
              Meunier, T., "HTTP Message Signatures Directory", Work in
              Progress, Internet-Draft,
              draft-meunier-http-message-signatures-directory-00,
              15 April 2025.

Appendix A.  Examples

A.1.  Signature-Agent on example.com

   Signature-Agent: example.com
   Allow: *

A.2.  Signature-Agent with raw keyid

   Not in the draft yet.  We don't want to incentivise not rotating
   public keys.

   Signature-Agent: poqkLGiymh_W0uP6PZFw-dvez3QJT5SolqXBCW38r0U
   Disallow: /path/to/resource

Acknowledgments

   TODO acknowledge.

Author's Address

   Thibault Meunier
   Cloudflare
   Email: ot-ietf@thibault.uk
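
   As an illustration of the directory token rules in Section 3.2.1,
   the Python sketch below checks the permitted character set and tests
   whether a directory token is an FQDN suffix of a Signature-Agent
   value.  It is not part of the specification.  The function names are
   hypothetical, the Signature-Agent value is assumed to be a bare host
   name as in the table of Section 3.2.1, and matching on whole DNS
   labels is one possible reading of "FQDN suffix".  Digits are
   rejected because the character set in Section 3.2.1 does not list
   them.

```python
import re

# Characters permitted by Section 3.2.1: lowercase letters, "_", "-", ".".
_DIRECTORY_TOKEN_RE = re.compile(r"[a-z_.-]+")


def is_valid_directory_token(token: str) -> bool:
    """Return True if the token uses only the characters Section 3.2.1 allows."""
    return _DIRECTORY_TOKEN_RE.fullmatch(token) is not None


def matches_signature_agent(directory_token: str, signature_agent: str) -> bool:
    """Return True if directory_token is a label-aligned suffix of signature_agent.

    Assumption: signature_agent is a bare host name. Label alignment means
    "example.com" matches "crawler.example.com" but not "badexample.com".
    """
    if not is_valid_directory_token(directory_token):
        return False
    token = directory_token.rstrip(".")
    if not token:
        return False
    # Normalise case and any trailing root dot before comparing.
    agent = signature_agent.strip().lower().rstrip(".")
    return agent == token or agent.endswith("." + token)
```

   With the values from the table in Section 3.2.1,
   matches_signature_agent("example.com", "crawler.example.com") holds,
   so a crawler sending that header would select the group that starts
   with "signature-agent: example.com".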