blog.smigiel.dev

Oct 01, 2020

Web Applications

Basics

During building web applications we need to focus on several fundamental aspects:

  1. HTTP Protocol Basics
  2. Cookies
  3. Sessions
  4. Same Origin Policy

HTTP Protocol Basics

HTTP (Hypertext Transfer Protocol) is the most used protocol on the Internet. Usually, the client (browser), connects to the server (Apache, nginx, ISS. During an HTTP communication, the client and the server exchange messages. HTTP works on top of TCP. That means, first a TCP connection is established, and then the client sends its request, and waits for the answer. The server processes the request and sends back its answer, providing a status code and appropriate data.

HTTP Request

To send HTTP request, usually we use browsers. However, it is not the only way of doing so. To build request, we can use netcat or telnet. Below is an example of HTTP request, with nc requesting website running on localhost on port 8000.

$ nc localhost 8000
GET / HTTP/1.0
Host: localhost

HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/3.7.3
Date: Fri, 02 Oct 2020 01:44:49 GMT
Content-type: text/html
Content-Length: 26612
Last-Modified: Fri, 02 Oct 2020 01:44:33 GMT

<!DOCTYPE html>
[...]

First line contains HTTP request method, path and protocol version. HTTP request method is information about the type of the request. Other requests are: PUT, POST, HEAD, etc. List of available metods can be found on Mozilla website Path tells the server which resource to fetch, while protocol version tells how to communicate with the client. Next, there is Host header field, which specifies the Internet hostname and port number of the resource being requested.

We can also create one-liner to retrieve appropriate information

$ echo -en 'HEAD / HTTP/1.1\r\nHost:localhost\r\n\r\n' | nc localhost 8000
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/3.7.3
Date: Fri, 02 Oct 2020 02:01:34 GMT
Content-type: text/html
Content-Length: 27806
Last-Modified: Fri, 02 Oct 2020 01:55:46 GMT

Additionally, we can give extra header values like User-Agent (identifies the client and the system), Accept (specifies document type the client is expecting in the response) or Connection: keep-alive (future communications with the server will reuse the current connection).

HTTP Response

HTTP response, similar to request, has common format.

$ nc localhost 8000
GET / HTTP/1.0
Host: localhost

HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/3.7.3
Date: Fri, 02 Oct 2020 01:44:49 GMT
Content-type: text/html
Content-Length: 26612
Last-Modified: Fri, 02 Oct 2020 01:44:33 GMT

<!DOCTYPE html>
[...]

The first line is Status-Line which consists of protocol version (HTTP 1.0) followed by a numeric status code (200) and textual meaning (OK). There are many codes. Description, again can be found on Mozilla website. Next, there is additional information regarding server, date and time at which the message was originated, followed by page content.

HTTP Secure (HTTPS)

HTTP over SSL/TLS is a method to run clear-text HTTP with extra cryptographic protocol, to prevent from sniffing. By doing so, entire traffic is being encrypted. It means, that even if someone can intercept traffic, they won't be able to see what kind of information is being transmitted. The only non-encrypted pieces of information would be:

  • target IP address
  • target port
  • DNS or similar protocols (domain resolvers)

To analyze connection, we cannot use nc anymore, which doesn't support SSL. To work with SSL, we can use openssl

$ nc blog.smigiel.dev 443
HEAD / HTTP/1.1
$
$ openssl s_client -connect blog.smigiel.dev:443 -quiet
depth=2 O = Digital Signature Trust Co., CN = DST Root CA X3
verify return:1
depth=1 C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
verify return:1
depth=0 CN = blog.smigiel.dev
verify return:1
HEAD / HTTP/1.1
Host: blog.smigiel.dev

HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 30082
Server: GitHub.com
Content-Type: text/html; charset=utf-8
Last-Modified: Wed, 30 Sep 2020 03:07:04 GMT
ETag: "5f73f658-7582"
Access-Control-Allow-Origin: *
Expires: Fri, 02 Oct 2020 01:14:17 GMT
Cache-Control: max-age=600
X-Proxy-Cache: MISS
X-GitHub-Request-Id: 376E:4248:1F613:26B71:5F767C91
Accept-Ranges: bytes
Date: Fri, 02 Oct 2020 02:57:10 GMT
Via: 1.1 varnish
Age: 49
X-Served-By: cache-pao17443-PAO
X-Cache: HIT
X-Cache-Hits: 1
X-Timer: S1601607430.114992,VS0,VE1
Vary: Accept-Encoding
X-Fastly-Request-ID: b902c4a5a2f4410e8e6ac21e60fa2e659f1e8300

HTTP Cookies

HTTP is stateless protocol. It means, that website cannot keep the state of a visit across different HTTP requests. Every HTTP request is unrelated to others. To change this situation, cookies were introduced.

Cookies are textual information installed by a website into web browser. Server can set a cookie using Set-Cookie header field.

Cookie fields

Cookies are only sent to the valid domain and path when they are not expired and according to their flags. The domain field and the path field set the scope of the cookie. The browser sends the cookie only if the request is for the right domain. If the server does not specify domain attribute, the browser will automatically set the domain as the server domain and set the cookie host-only flag. It means that the cookie will be sent only to that precise hostname.

Cookie protocol

Cookies are usually set during a login. Browser sends POST request to the server, and the server responds with a Set-Cookie header field. For every subsequent request, the browser considers domain, path, expiration and flags. If all checks pass, the browser will insert a cookie: header in the request.

Sessions

Sessions mechanism works similar to cookies. The difference here relies on the way how the information is stored. Session is kept on server-side. Each user session is identified by a session id which the server assigns to the user. The client then presents its ID for each subsequent request, thus being recognized by the server. The server retrieves the state of the client and all its associated variables.

Same-Origin Policy (SOP)

Same-Origin Policy prevents JavaScript code from getting or setting properties on a resource comming from a different origin. To determine if JavaScript can access a resource hostname, port and protocol must match. SOP applies only to the actual code. It is still possible to include external resources by other HTML tags like img, script, iframe, object, etc.

Additional resources