HW 1
COMP 431: INTERNET SERVICES & PROTOCOLS
Kevin Jeffay
Fall 2021
Homework 1, August 18
Due: 11:15 AM, August 30



Baby-steps Towards the Construction of a Mail Server: Parsing in Python

The goal for the first half of the course is to build a simple mail server and mail reader that will work with standard Internet mail servers using SMTP (Simple Mail Transfer Protocol) and POP (Post Office Protocol), and with mail readers such as Gmail, Apple Mail, and certain versions of Thunderbird or Microsoft Outlook. This assignment is the first (very!) small piece of the mail server: a simple string parser that we will build upon in later assignments.

At a high level, a mail server is simply a program that receives SMTP messages (text strings) from clients, processes the messages, and sends the results of the processing back to the client as a response to the message. In this abstract view, a mail server is a program that executes a logically infinite loop in which it receives a message, processes the message, and then waits for the next message.

In this assignment you will develop the portion of the code that will be used by the mail server to process messages it receives. Specifically, you are to write a program to determine if a message (a text string) is a valid SMTP “MAIL FROM” message. This is a message that tells a mail server who (which user) is trying to send an email message. An SMTP message is simply a line of text that looks like the following:

MAIL FROM:

In the SMTP protocol, this message is called the “MAIL” or “MAIL FROM” message.
This message is made up of three substrings:

• a command/message name: the string “MAIL FROM:”,
• a “reverse path”: a well-formed “email address” delimited by angle brackets (“<” and “>”) that represents the sender of the message being mailed, and
• a “CRLF” line terminator: the message must be terminated by the “carriage return-line feed character sequence” (the Linux “newline” character [which is not visible in the example above since it is a non-printable character]).

The MAIL FROM message is part of the larger SMTP protocol. Protocols such as SMTP are typically specified more formally than the English description above by using a more mathematical specification notation. (These notations are, in essence, a textual form of the syntax diagrams, sometimes called “railroad diagrams,” that are used to specify the formal syntax of a programming language.)

For example, the formal description of the MAIL FROM message is [1]:

::= “MAIL” “FROM:”
::=
::= "<"> ">"
::= "@"
::=
::= |
::= any one of the printable ASCII characters, but not any or
::= | "."
::= |
::=
::= any one of the 52 alphabetic characters A through Z in upper case and a through z in lower case
::= |
::= |
::= any one of the ten digits 0 through 9
::= the newline character
::= the space or tab character
::= "<" | ">" | "(" | ")" | "[" | "]" | "\" | "." | "," | ";" | ":" | "@" | """

In this notation:

• Items appearing (in angle brackets) on the left-hand side of an expression are called tokens,
• Anything in quotes is interpreted as a string or character that must appear exactly as written,
• Anything in square brackets (“[” and “]”) is optional and is not required to be present,
• The vertical bar “|” is interpreted as a logical ‘or’ operator and indicates a choice between components that are mutually exclusive.
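To make the notation concrete, here is a minimal sketch of how rules of the recursive form "X ::= Y | Y X" and a choice written with “|” can be turned directly into Python functions. It assumes that the host-name part of the address (the rule containing the literal ".") is a dot-separated sequence of elements, each of which is a letter followed by zero or more letters or digits. The function names (`let_dig_str`, `element`, `domain`, `is_domain`) are illustrative labels chosen for this sketch and are not taken from the assignment.

```python
import string

LETTERS = set(string.ascii_letters)   # the 52 alphabetic characters
DIGITS = set(string.digits)           # the ten digits 0 through 9


def let_dig_str(s, i):
    """Rule shape X ::= Y | Y X, where Y is one letter or digit.

    Returns the index just past the longest match starting at i, or -1.
    """
    if i < len(s) and (s[i] in LETTERS or s[i] in DIGITS):
        rest = let_dig_str(s, i + 1)          # try the longer alternative first
        return rest if rest != -1 else i + 1  # otherwise fall back to a single Y
    return -1


def element(s, i):
    """Assumed rule: a letter, optionally followed by letters and digits."""
    if i < len(s) and s[i] in LETTERS:
        rest = let_dig_str(s, i + 1)
        return rest if rest != -1 else i + 1
    return -1


def domain(s, i):
    """Assumed rule: element | element "." domain."""
    end = element(s, i)
    if end == -1:
        return -1
    if end < len(s) and s[end] == ".":
        rest = domain(s, end + 1)             # second alternative of the choice
        if rest != -1:
            return rest
    return end                                # first alternative: one element


def is_domain(s):
    """True if the whole string matches the assumed domain rule."""
    return domain(s, 0) == len(s)
```

Each function takes a string and a starting index and returns the index just past what it matched, so one rule can call another exactly where the grammar mentions it. Under these assumptions `is_domain("cs.unc.edu")` is true, while `is_domain("cs..edu")` and `is_domain("1cs.edu")` are false.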
For example, the sample MAIL FROM message above conforms to the formal description and hence is a valid SMTP message (assuming it is terminated with a newline character, the default line termination “character” for Linux). The following strings do not conform to the formal description and would be rejected as invalid or illegal requests.

MAIL FROM:
MAIL FROM:
mail from:

The first string contains an invalid mailbox component and the second string contains an invalid path component. The third string is technically invalid because it does not contain the literal strings “MAIL” or “FROM” (in upper case). However, as a practical matter, it turns out that the third string will be recognized by most mail servers as a valid SMTP MAIL FROM message. That is, while the formal SMTP protocol definition requires messages to be in upper case, virtually all implementations of SMTP in servers allow messages to be in either upper or lower case. This illustrates one of the (frustrating!) realities of networking: often protocol implementations take liberties with the formal specifications and these liberties become de facto standards.

[1] As an aside, this form of notation is a variation of a commonly used notation called Backus-Naur Form (BNF). You will often see the syntax of protocols expressed using BNF and variations on BNF.

As another example, notice that in the grammar above, the reverse path token must occur immediately after the literal string “FROM:”. That is, the grammar does not allow any space to appear between “FROM:” and the reverse-path token. However, it turns out that virtually all SMTP implementations don’t implement this rule in the grammar exactly as specified and instead allow any amount of “whitespace” (spaces and tabs) to appear between any of the elements of the MAIL FROM message.
For example, technically, the following MAIL FROM messages would be invalid according to the formal specification because they either have “too much” whitespace or whitespace in disallowed locations:

MAIL FROM:
MAIL FROM:
MAIL FROM:

Nonetheless, most SMTP servers would consider all of these strings to be valid MAIL FROM messages and treat them as being equivalent to:

MAIL FROM:

Thus, the de facto standard grammar for the MAIL FROM message (the grammar implemented by most SMTP servers) is

::= “:”
::= “MAIL” | “mail”
::= “FROM” | “from”
::= |
::= |

where “null” means “no character.” (Thus, a “nullspace” is zero or more whitespace characters.)

The Assignment: A Python Programming Warmup Exercise

For this assignment you are to write a Python program on Linux to read in lines of characters from standard input (i.e., the keyboard) and determine which lines, if any, are legal SMTP MAIL FROM messages. In computer-science-speak, what you are building is a parser to “recognize” strings that conform to the grammar for a MAIL FROM message. (And the process of processing input lines is called “parsing.”) Specifically, you are to implement a parser for the following grammar:

::= “MAIL” “FROM:”
::= |
::= the space or tab character
::= |
::= no character
::=
::= "<"> ">"
::= "@"
::=
::= |
::= any one of the printable ASCII characters, but not any of or
::= | "."
::= |
::=
::= any one of the 52 alphabetic characters A through Z in upper case and a through z in lower case
::= |
::= |
::= any one of the ten digits 0 through 9
::= the newline character
::= "<" | ">" | "(" | ")" | "[" | "]" | "\" | "." | "
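One way to approach such a recognizer is a small hand-written scanner that walks the line from left to right. The following is a minimal sketch under stated assumptions, not a reference solution. It assumes the command has the shape: the literal “MAIL”, at least one space or tab, the literal “FROM:”, optional whitespace, a path in angle brackets, optional whitespace, and a newline. It also assumes the part before "@" is one or more permitted characters and the part after "@" is a dot-separated sequence of elements that each start with a letter. All function names (`is_char`, `skip_whitespace`, `parse_domain`, `parse_path`, `is_mail_from`, `check_lines`) are hypothetical, and any output format the assignment may require is not shown.

```python
# Special characters that may not appear before the "@" (assumed list).
SPECIALS = set('<>()[]\\.,;:@"')


def is_char(c):
    """Printable ASCII that is neither a special character nor space/tab."""
    return 33 <= ord(c) <= 126 and c not in SPECIALS


def skip_whitespace(line, i):
    """Advance past zero or more spaces/tabs and return the new index."""
    while i < len(line) and line[i] in " \t":
        i += 1
    return i


def parse_domain(line, i):
    """Match element ("." element)*; return the index after it, or -1."""
    while True:
        # Each element must start with an ASCII letter.
        if i >= len(line) or not (line[i].isascii() and line[i].isalpha()):
            return -1
        i += 1
        # The rest of the element may be ASCII letters or digits.
        while i < len(line) and line[i].isascii() and line[i].isalnum():
            i += 1
        if i < len(line) and line[i] == ".":
            i += 1          # a dot must be followed by another element
            continue
        return i


def parse_path(line, i):
    """Match "<" local-part "@" domain ">"; return the index after it, or -1."""
    if i >= len(line) or line[i] != "<":
        return -1
    i += 1
    start = i
    while i < len(line) and is_char(line[i]):
        i += 1
    if i == start or i >= len(line) or line[i] != "@":
        return -1           # empty local part or missing "@"
    i = parse_domain(line, i + 1)
    if i == -1 or i >= len(line) or line[i] != ">":
        return -1
    return i + 1


def is_mail_from(line):
    """True if line (including its trailing newline) matches the assumed grammar."""
    if not line.startswith("MAIL"):
        return False
    after_ws = skip_whitespace(line, 4)
    if after_ws == 4:       # at least one space or tab is required here
        return False
    if not line.startswith("FROM:", after_ws):
        return False
    i = skip_whitespace(line, after_ws + 5)
    i = parse_path(line, i)
    if i == -1:
        return False
    i = skip_whitespace(line, i)
    return line[i:] == "\n"


def check_lines(lines):
    """Yield True/False for each input line; pass sys.stdin to read standard input."""
    for line in lines:
        yield is_mail_from(line)
```

With this sketch, `is_mail_from("MAIL FROM: <user@example.com>\n")` returns `True`, while a line with no whitespace between “MAIL” and “FROM:”, a lower-case command, or a missing newline returns `False`. The address `user@example.com` is only a placeholder used for illustration.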
Nov 07, 2021