Docker environment at D4Science

D4Science Docker infrastructure

A production cluster is available, based on Docker Swarm [1]. The cluster consists of:

  • three manager nodes
  • currently, five worker nodes.
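
The current composition of the cluster can be checked from any of the manager nodes with the standard Docker Swarm CLI:

 # list all the nodes of the swarm, with their role and availability
 docker node ls
 # managers only / workers only
 docker node ls --filter "role=manager"
 docker node ls --filter "role=worker"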

The running services are exposed through two layers of HAPROXY load balancers:

  • An L4 layer, used to reach the http/https services exposed by the L7 layer
  • An L7 layer, running in the swarm, configured to dynamically resolve the backend names using the Docker internal DNS service

Provisioning of the Docker Swarm

The Swarm, together with the Portainer [2] and L7 HAPROXY [3] installations, is managed by Ansible, starting from the role [4].
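
As a sketch, such a role can be applied to the cluster nodes with a playbook along these lines (the group and role names here are hypothetical placeholders, not the ones used in the actual repository):

 # hypothetical playbook: applies a Swarm provisioning role to managers and workers
 - hosts: docker_swarm_managers:docker_swarm_workers
   become: true
   roles:
     - role: docker-swarm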

The load balancer architecture

L4 load balancers

All requests to public containerized services pass through two HAPROXY servers in a high availability setup, acting as level 4 (TCP) proxies. The basic configuration is:

 # Plain HTTP: redirect every request to HTTPS
 frontend LB
   bind *:80
   mode http
   redirect scheme https if !{ ssl_fc }
 # HTTPS: forward the TCP stream, unmodified, to the L7 HAPROXY layer running in the swarm
 frontend lb4_swarm
      bind                  *:443
      mode                  tcp
      description           L4 swarm
      default_backend lb4_swarm_bk
 backend lb4_swarm_bk
      mode                  tcp
      # hash on the source address, so that a client always lands on the same L7 instance
      balance               source
      hash-type             consistent
      option                tcp-check
      tcp-check             connect port 443 ssl send-proxy
      # the PROXY protocol towards the L7 layer preserves the original client address
      server docker-swarm1.int.d4science.net docker-swarm1.int.d4science.net:443 check fall 1 rise 1 inter 2s send-proxy sni req.ssl_sni
      server docker-swarm2.int.d4science.net docker-swarm2.int.d4science.net:443 check fall 1 rise 1 inter 2s send-proxy sni req.ssl_sni
      server docker-swarm3.int.d4science.net docker-swarm3.int.d4science.net:443 check fall 1 rise 1 inter 2s send-proxy sni req.ssl_sni

A client is guaranteed to be routed to the same HAPROXY backend instance, as long as that instance is alive.

L7 HAPROXY cluster

  • HTTPS requests are handled by three HAPROXY instances.
  • Backend hostnames are resolved using the Docker internal DNS.
  • Backends can be single or multiple instances (if they support such a configuration).
  • Sticky sessions can be managed through cookies, the source address, etc.
  • Rate limiting policies can be applied (a sketch follows below).
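
As an illustration of the last point, a minimal sketch of a rate limiting rule that can be added to a frontend section; the table size, window, and threshold are illustrative assumptions, not the values used in production:

 # track the request rate of each source IP over a 10s window
 stick-table type ip size 100k expire 30s store http_req_rate(10s)
 http-request track-sc0 src
 # reject clients exceeding 100 requests in the window
 http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }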

A configuration snippet showing the frontend and a backend with cookie-based sticky sessions:

 frontend http
   # terminate TLS; the PROXY protocol coming from the L4 layer is accepted on both binds
   bind *:443 ssl crt /etc/pki/haproxy alpn h2,http/1.1 accept-proxy
   bind *:80 accept-proxy
   mode http
   option http-keep-alive
   option forwardfor
   http-request add-header X-Forwarded-Proto https
   # HSTS (63072000 seconds)
   http-response set-header Strict-Transport-Security max-age=63072000
   acl lb_srv hdr(host) -i load-balanced-service.d4science.org
   redirect scheme https code 301 if !{ ssl_fc }
   use_backend lb_stack_name_bk if lb_srv
 backend lb_stack_name_bk
   mode http
   option httpchk
   balance leastconn
   http-check send meth HEAD uri / ver HTTP/1.1 hdr Host load-balanced-service.d4science.org
   http-check expect rstatus (2|3)[0-9][0-9]
   # sticky sessions via a dynamically generated cookie
   dynamic-cookie-key load-balanced-service
   cookie JSESSIONID prefix nocache dynamic
   # two backend slots, resolved at runtime through the Docker internal DNS (see the resolvers sketch below)
   server-template lb-service-name- 2 lb_stack_name-lb_service_name:8080 check inter 10s rise 2 fall 3 resolvers docker init-addr libc,none
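
The server-template lines reference a resolvers section named docker, which is not shown above. A minimal sketch of such a section, assuming the standard address of the Docker embedded DNS server (127.0.0.11:53); the retry and hold values are illustrative:

 resolvers docker
   nameserver dns1 127.0.0.11:53
   resolve_retries 3
   timeout resolve 1s
   timeout retry   1s
   hold valid      10s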

Docker Stack

  • The services can be deployed into the Docker cluster as stacks [5] using a specially crafted compose file. The Open ASFA case can be used as a working example [6].
  • The web services must be connected to the haproxy-public network and set up to use the dnsrr deploy mode, to be discoverable by HAPROXY. The creation of this network is sketched after the snippet below.
  • HAPROXY must be configured to expose the service. Example for a shinyproxy service with two instances:
 backend shinyproxy_bck
   mode http
   option httpchk
   balance leastconn
   http-check send meth HEAD uri / ver HTTP/1.1 hdr Host localhost
   http-check expect rstatus (2|3)[0-9][0-9]
   stick on src
   stick-table type ip size 2m expire 180m
   server-template shinyproxy- 2 shinyproxy_shinyproxy:8080  check resolvers docker init-addr libc,none
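
The haproxy-public network referenced above is created outside of the stacks and marked as external in the compose files. A sketch of how such an overlay network can be created from a manager node (the actual options used in the D4Science setup may differ):

 docker network create --driver overlay haproxy-public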

Docker compose example

  • The following is an example of a stack made of two services, where one talks to the other using its Docker service name:
 version: '3.6'
 services:
   conductor-server:
     environment:
       - CONFIG_PROP=conductor-swarm-config.properties
     image: nubisware/conductor-server
     networks:
       - conductor-network
       - haproxy-public
     deploy:
       mode: replicated
       replicas: 2
       endpoint_mode: dnsrr
       placement:
         constraints: [node.role == worker]
       restart_policy:
         condition: on-failure
         delay: 5s
         max_attempts: 3
         window: 120s
     configs:
       - source: swarm-config
         target: /app/config/conductor-swarm-config.properties
     logging:
       driver: "journald"
   conductor-ui:
     environment:
       - WF_SERVER=http://conductor-server:8080/api/
     image: nubisware/conductor-ui
     networks:
       - conductor-network
       - haproxy-public
     deploy:
       mode: replicated
       replicas: 2
       endpoint_mode: dnsrr
       placement:
         constraints: [node.role == worker]
       restart_policy:
         condition: on-failure
         delay: 5s
         max_attempts: 3
         window: 120s
 networks:
   conductor-network:
   haproxy-public:
     external: True
 configs:
   swarm-config:
     file: ./conductor-swarm-config.properties
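
Assuming the compose file above is saved as conductor-swarm.yaml (the file name here is an assumption), the stack can be deployed from a manager node with docker stack deploy:

 # "conductor" is the stack name and becomes the prefix of the service names
 docker stack deploy -c conductor-swarm.yaml conductor

The resulting services are then named conductor_conductor-server and conductor_conductor-ui, which is the naming scheme used by the server-template directives in the HAPROXY backends.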