SwarmKit

SwarmKit is a toolkit for orchestrating distributed systems at any scale. It includes primitives for node discovery, raft-based consensus, task scheduling and more.

Its main benefits are:

  • Distributed: SwarmKit uses the Raft Consensus Algorithm to coordinate and does not rely on a single point of failure to make decisions.
  • Secure: Node communication and membership within a Swarm are secure out of the box. SwarmKit uses mutual TLS for node authentication, role authorization and transport encryption, automating both certificate issuance and rotation.
  • Simple: SwarmKit is operationally simple and minimizes infrastructure dependencies. It does not need an external database to operate.
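To make the "no single point of failure" claim concrete, here is a minimal sketch of the majority arithmetic that Raft-style consensus relies on. It illustrates the general Raft property only and is not SwarmKit code; the function names `quorum` and `faultTolerance` are made up for this example.

```go
package main

import "fmt"

// quorum returns how many managers must agree before a Raft group of n
// managers can commit a change (a strict majority).
func quorum(n int) int {
	return n/2 + 1
}

// faultTolerance returns how many managers may fail while the survivors
// can still form a quorum.
func faultTolerance(n int) int {
	if n < 1 {
		return 0
	}
	return (n - 1) / 2
}

func main() {
	for _, n := range []int{1, 3, 4, 5, 7} {
		fmt.Printf("managers=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), faultTolerance(n))
	}
}
```

Note that going from 3 to 4 managers raises the quorum without raising the number of tolerated failures, which is why odd manager counts are the usual choice.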

Overview

Machines running SwarmKit can be grouped together in order to form a Swarm, coordinating tasks with each other. Once a machine joins, it becomes a Swarm Node. Nodes can either be worker nodes or manager nodes.

  • Worker Nodes are responsible for running Tasks using an Executor. SwarmKit comes with a default Docker Container Executor that can be easily swapped out.
  • Manager Nodes on the other hand accept specifications from the user and are responsible for reconciling the desired state with the actual cluster state.

An operator can dynamically update a Node's role by promoting a Worker to Manager or demoting a Manager to Worker.

Tasks are organized in Services. A service is a higher level abstraction that allows the user to declare the desired state of a group of tasks. Services define what type of task should be created as well as how to execute them (e.g. run this many replicas at all times) and how to update them (e.g. rolling updates).

Features

Some of SwarmKit's main features are:

  • Orchestration

    • Desired State Reconciliation: SwarmKit constantly compares the desired state against the current cluster state and reconciles the two if necessary. For instance, if a node fails, SwarmKit reschedules its tasks onto a different node.

    • Service Types: There are different types of services. The project currently ships with two of them out of the box:

      • Replicated Services are scaled to the desired number of replicas.
      • Global Services run one task on every available node in the cluster.
    • Configurable Updates: At any time, you can change the value of one or more fields for a service. After you make the update, SwarmKit reconciles the desired state by ensuring all tasks are using the desired settings. By default, it performs a lockstep update, that is, it updates all tasks at the same time. This can be configured through different knobs:

      • Parallelism defines how many updates can be performed at the same time.
      • Delay sets the minimum delay between updates. SwarmKit starts by shutting down the previous task, brings up a new one, waits for it to transition to the RUNNING state, then waits for the additional configured delay. Finally, it moves on to other tasks.
    • Restart Policies: The orchestration layer monitors tasks and reacts to failures based on the specified policy. The operator can define restart conditions, delays and limits (maximum number of attempts in a given time window). SwarmKit can decide to restart a task on a different machine. This means that faulty nodes will gradually be drained of their tasks.

  • Scheduling

    • Resource Awareness: SwarmKit is aware of resources available on nodes and will place tasks accordingly.

    • Constraints: Operators can limit the set of nodes where a task can be scheduled by defining constraint expressions. When multiple constraints are given, a node must satisfy every expression, i.e., they are combined with AND. Constraints can match the node attributes in the following table. Note that engine.labels are collected from Docker Engine and carry information such as operating system, drivers, etc., while node.labels are added by cluster administrators for operational purposes. For example, some nodes may carry security compliance labels so that tasks with compliance requirements run only on them.

      node attribute      matches                                example
      --------------      -------                                -------
      node.id             node's ID                              node.id == 2ivku8v2gvtg4
      node.hostname       node's hostname                        node.hostname != node-2
      node.ip             node's IP address                      node.ip != 172.19.17.0/24
      node.role           node's manager or worker role          node.role == manager
      node.platform.os    node's operating system                node.platform.os == linux
      node.platform.arch  node's architecture                    node.platform.arch == x86_64
      node.labels         node's labels added by cluster admins  node.labels.security == high
      engine.labels       Docker Engine's labels                 engine.labels.operatingsystem == ubuntu 14.04
    • Strategies: The project currently ships with a spread strategy which will attempt to schedule tasks on the least loaded nodes, provided they meet the constraints and resource requirements.

  • Cluster Management

    • State Store: Manager nodes maintain a strongly consistent, replicated (Raft based) and extremely fast (in-memory reads) view of the cluster which allows them to make quick scheduling decisions while tolerating failures.
    • Topology Management: Node roles (Worker / Manager) can be dynamically changed through API/CLI calls.
    • Node Management: An operator can alter the desired availability of a node. Setting it to Paused prevents any further tasks from being scheduled on it. Setting it to Drained has the same effect and also reschedules its existing tasks elsewhere (mostly for maintenance scenarios).
  • Security

    • Mutual TLS: All nodes communicate with each other using mutual TLS. Swarm managers act as a Root Certificate Authority, issuing certificates to new nodes.
    • Token-based Join: All nodes require a cryptographic token to join the swarm, which defines that node's role. Tokens can be rotated as often as desired without affecting already-joined nodes.
    • Certificate Rotation: TLS Certificates are rotated and reloaded transparently on every node, allowing a user to set how frequently rotation should happen (the current default is 3 months, the minimum is 30 minutes).
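The AND semantics of scheduling constraints described above can be sketched as follows. This is an illustration, not SwarmKit's parser or scheduler: the `Constraint` type and the `parseConstraint` and `matches` functions are hypothetical, node attributes are modeled as a flat string map, and values are compared as exact strings (so subnet matching on node.ip, for instance, is not modeled).

```go
package main

import (
	"fmt"
	"strings"
)

// Constraint is a hypothetical parsed form of an expression such as
// "node.role == manager".
type Constraint struct {
	Key   string
	Equal bool // true for ==, false for !=
	Value string
}

// parseConstraint splits an expression at its first == or != operator.
func parseConstraint(expr string) (Constraint, error) {
	eq := strings.Index(expr, "==")
	ne := strings.Index(expr, "!=")
	var pos int
	var equal bool
	switch {
	case eq < 0 && ne < 0:
		return Constraint{}, fmt.Errorf("no operator in %q", expr)
	case ne < 0 || (eq >= 0 && eq < ne):
		pos, equal = eq, true
	default:
		pos, equal = ne, false
	}
	key := strings.TrimSpace(expr[:pos])
	value := strings.TrimSpace(expr[pos+2:])
	if key == "" || value == "" {
		return Constraint{}, fmt.Errorf("invalid constraint %q", expr)
	}
	return Constraint{Key: key, Equal: equal, Value: value}, nil
}

// matches reports whether a node satisfies every constraint (AND match).
// In this sketch a missing attribute never equals anything, so it fails
// an == constraint and passes a != constraint.
func matches(attrs map[string]string, constraints []Constraint) bool {
	for _, c := range constraints {
		got, ok := attrs[c.Key]
		same := ok && got == c.Value
		if same != c.Equal {
			return false
		}
	}
	return true
}

func main() {
	exprs := []string{"node.role == manager", "node.labels.security == high"}
	var constraints []Constraint
	for _, e := range exprs {
		c, err := parseConstraint(e)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		constraints = append(constraints, c)
	}
	nodes := []map[string]string{
		{"node.hostname": "node-1", "node.role": "manager", "node.labels.security": "high"},
		{"node.hostname": "node-2", "node.role": "worker", "node.labels.security": "high"},
	}
	for _, n := range nodes {
		fmt.Printf("%s eligible=%v\n", n["node.hostname"], matches(n, constraints))
	}
}
```

Here node-1 satisfies both expressions, while node-2 is rejected because it fails the role constraint even though its security label matches.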

Build

Requirements:

SwarmKit is built in Go and leverages a standard project structure to work well with Go tooling. If you are new to Go, please see BUILDING.md for a more detailed guide.

Once you have SwarmKit checked out in your $GOPATH, the Makefile can be used for common tasks.

From the project root directory, run the following to build swarmd and swarmctl:

$ make binaries

Test

Before running tests for the first time, set up the tooling:

$ make setup

Then run:

$ make all

Usage Examples

Setting up a Swarm

These instructions assume that swarmd and swarmctl are in your PATH.

(Before starting, make sure the /tmp/node-N directories don't exist.)

Initialize the first node:

$ swarmd -d /tmp/node-1 --listen-control-api /tmp/node-1/swarm.sock --hostname node-1

Before other nodes can join the cluster, fetch the join token:

$ export SWARM_SOCKET=/tmp/node-1/swarm.sock
$ swarmctl cluster inspect default
ID          : 87d2ecpg12dfonxp3g562fru1
Name        : default
Orchestration settings:
  Task history entries: 5
Dispatcher settings:
  Dispatcher heartbeat period: 5s
Certificate Authority settings:
  Certificate Validity Duration: 2160h0m0s
  Join Tokens:
    Worker: SWMTKN-1-3vi7ajem0jed8guusgvyl98nfg18ibg4pclify6wzac6ucrhg3-0117z3s2ytr6egmmnlr6gd37n
    Manager: SWMTKN-1-3vi7ajem0jed8guusgvyl98nfg18ibg4pclify6wzac6ucrhg3-d1ohk84br3ph0njyexw0wdagx

In two additional terminals, join two nodes. In the example below, replace 127.0.0.1:4242 with the address of the first node, and use the <Worker Token> acquired above. In this example, the <Worker Token> is SWMTKN-1-3vi7ajem0jed8guusgvyl98nfg18ibg4pclify6wzac6ucrhg3-0117z3s2ytr6egmmnlr6gd37n. If the joining nodes run on the same host as node-1, select a different remote listening port, e.g., --listen-remote-api 127.0.0.1:4343.

$ swarmd -d /tmp/node-2 --hostname node-2 --join-addr 127.0.0.1:4242 --join-token <Worker Token>
$ swarmd -d /tmp/node-3 --hostname node-3 --join-addr 127.0.0.1:4242 --join-token <Worker Token>

If joining as a manager, also specify --listen-control-api.

$ swarmd -d /tmp/node-4 --hostname node-4 --join-addr 127.0.0.1:4242 --join-token <Manager Token> --listen-control-api /tmp/node-4/swarm.sock --listen-remote-api 127.0.0.1:4245

In a fourth terminal, use swarmctl to explore and control the cluster. Before running swarmctl, set the SWARM_SOCKET environment variable to the path of the manager socket that was specified in --listen-control-api when starting the manager.

To list nodes:

$ export SWARM_SOCKET=/tmp/node-1/swarm.sock
$ swarmctl node ls
ID                         Name    Membership  Status  Availability  Manager Status
--                         ----    ----------  ------  ------------  --------------
3x12fpoi36eujbdkgdnbvbi6r  node-2  ACCEPTED    READY   ACTIVE
4spl3tyipofoa2iwqgabsdcve  node-1  ACCEPTED    READY   ACTIVE        REACHABLE *
dknwk1uqxhnyyujq66ho0h54t  node-3  ACCEPTED    READY   ACTIVE
zw3rwfawdasdewfq66ho34eaw  node-4  ACCEPTED    READY   ACTIVE        REACHABLE

Creating Services

Start a redis service:

$ swarmctl service create --name redis --image redis:3.0.5
08ecg7vc7cbf9k57qs722n2le

List the running services:

$ swarmctl service ls
ID                         Name   Image        Replicas
--                         ----   -----        --------
08ecg7vc7cbf9k57qs722n2le  redis  redis:3.0.5  1/1

Inspect the service:

$ swarmctl service inspect redis
ID                : 08ecg7vc7cbf9k57qs722n2le
Name              : redis
Replicas          : 1/1
Template
 Container
  Image           : redis:3.0.5

Task ID                      Service    Slot    Image          Desired State    Last State                Node
-------                      -------    ----    -----          -------------    ----------                ----
0xk1ir8wr85lbs8sqg0ug03vr    redis      1       redis:3.0.5    RUNNING          RUNNING 1 minutes ago    node-1

Updating Services

You can update any attribute of a service.

For example, you can scale the service by changing the instance count:

$ swarmctl service update redis --replicas 6
08ecg7vc7cbf9k57qs722n2le

$ swarmctl service inspect redis
ID                : 08ecg7vc7cbf9k57qs722n2le
Name              : redis
Replicas          : 6/6
Template
 Container
  Image           : redis:3.0.5

Task ID                      Service    Slot    Image          Desired State    Last State                Node
-------                      -------    ----    -----          -------------    ----------                ----
0xk1ir8wr85lbs8sqg0ug03vr    redis      1       redis:3.0.5    RUNNING          RUNNING 3 minutes ago    node-1
25m48y9fevrnh77til1d09vqq    redis      2       redis:3.0.5    RUNNING          RUNNING 28 seconds ago    node-3
42vwc8z93c884anjgpkiatnx6    redis      3       redis:3.0.5    RUNNING          RUNNING 28 seconds ago    node-2
d41f3wnf9dex3mk6jfqp4tdjw    redis      4       redis:3.0.5    RUNNING          RUNNING 28 seconds ago    node-2
66lefnooz63met6yfrsk6myvg    redis      5       redis:3.0.5    RUNNING          RUNNING 28 seconds ago    node-1
3a2sawtoyk19wqhmtuiq7z9pt    redis      6       redis:3.0.5    RUNNING          RUNNING 28 seconds ago    node-3

Changing replicas from 1 to 6 forced SwarmKit to create 5 additional Tasks in order to comply with the desired state.
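The arithmetic behind this reconciliation step can be sketched as below. This is a simplified illustration for a replicated service, not SwarmKit's orchestrator; the function name `reconcileReplicas` is an assumption made for the example.

```go
package main

import "fmt"

// reconcileReplicas compares the desired replica count with the number of
// running tasks and returns how many tasks to create and how many to remove
// so that the actual state converges on the desired state.
func reconcileReplicas(desired, running int) (create, remove int) {
	switch {
	case desired > running:
		return desired - running, 0
	case running > desired:
		return 0, running - desired
	}
	return 0, 0
}

func main() {
	// Scaling redis from 1 replica to 6, as in the example above.
	create, remove := reconcileReplicas(6, 1)
	fmt.Printf("create=%d remove=%d\n", create, remove)
}
```

The same comparison applies when a task disappears, for example after a node failure: the running count drops below the desired count and replacement tasks are created.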

Every other field can be changed as well, such as image, args, env, ...

Let's change the image from redis:3.0.5 to redis:3.0.6 (i.e., perform an upgrade):

$ swarmctl service update redis --image redis:3.0.6
08ecg7vc7cbf9k57qs722n2le

$ swarmctl service inspect redis
ID                   : 08ecg7vc7cbf9k57qs722n2le
Name                 : redis
Replicas             : 6/6
Update Status
 State               : COMPLETED
 Started             : 3 minutes ago
 Completed           : 1 minute ago
 Message             : update completed
Template
 Container
  Image              : redis:3.0.6

Task ID                      Service    Slot    Image          Desired State    Last State              Node
-------                      -------    ----    -----          -------------    ----------              ----
0udsjss61lmwz52pke5hd107g    redis      1       redis:3.0.6    RUNNING          RUNNING 1 minute ago    node-3
b8o394v840thk10tamfqlwztb    redis      2       redis:3.0.6    RUNNING          RUNNING 1 minute ago    node-1
efw7j66xqpoj3cn3zjkdrwff7    redis      3       redis:3.0.6    RUNNING          RUNNING 1 minute ago    node-3
8ajeipzvxucs3776e4z8gemey    redis      4       redis:3.0.6    RUNNING          RUNNING 1 minute ago    node-2
f05f2lbqzk9fh4kstwpulygvu    redis      5       redis:3.0.6    RUNNING          RUNNING 1 minute ago    node-2
7sbpoy82deq7hu3q9cnucfin6    redis      6       redis:3.0.6    RUNNING          RUNNING 1 minute ago    node-1

By default, all tasks are updated at the same time.

This behavior can be changed by defining update options.

For instance, in order to update tasks 2 at a time and wait at least 10 seconds between updates:

$ swarmctl service update redis --image redis:3.0.7 --update-parallelism 2 --update-delay 10s
$ watch -n1 "swarmctl service inspect redis"  # watch the update

This will update 2 tasks, wait for them to become RUNNING, then wait an additional 10 seconds before moving to other tasks.

Update options can be set at service creation and updated later on. If an update command doesn't specify update options, the last set of options will be used.
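How the parallelism and delay options interact can be sketched as follows. This is an illustration under simplifying assumptions, not SwarmKit's updater: `batches` and `rollingUpdate` are hypothetical names, a parallelism of 0 or less is treated here as "update everything at once", and the wait for each task to reach RUNNING is folded into the `update` callback.

```go
package main

import (
	"fmt"
	"time"
)

// batches splits task IDs into groups of at most parallelism tasks.
// In this sketch, a parallelism of 0 or less means a single group.
func batches(tasks []string, parallelism int) [][]string {
	if parallelism <= 0 || parallelism > len(tasks) {
		parallelism = len(tasks)
	}
	var out [][]string
	for start := 0; start < len(tasks); start += parallelism {
		end := start + parallelism
		if end > len(tasks) {
			end = len(tasks)
		}
		out = append(out, tasks[start:end])
	}
	return out
}

// rollingUpdate updates one group at a time and calls sleep(delay) between
// groups. The sleep function is injected so the example runs instantly.
func rollingUpdate(tasks []string, parallelism int, delay time.Duration,
	update func(task string), sleep func(time.Duration)) {
	groups := batches(tasks, parallelism)
	for i, group := range groups {
		for _, task := range group {
			update(task)
		}
		if i < len(groups)-1 {
			sleep(delay)
		}
	}
}

func main() {
	tasks := []string{"redis.1", "redis.2", "redis.3", "redis.4", "redis.5", "redis.6"}
	rollingUpdate(tasks, 2, 10*time.Second,
		func(task string) { fmt.Println("updating", task) },
		func(d time.Duration) { fmt.Println("waiting", d) },
	)
}
```

With 6 tasks, a parallelism of 2 and a 10 second delay, the sketch updates three groups of two and waits twice, once between each pair of consecutive groups.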

Node Management

SwarmKit monitors node health. In the case of node failures, it re-schedules tasks to other nodes.

An operator can manually define the Availability of a node and can Pause and Drain nodes.

Let's put node-1 into maintenance mode:

$ swarmctl node drain node-1

$ swarmctl node ls
ID                         Name    Membership  Status  Availability  Manager Status
--                         ----    ----------  ------  ------------  --------------
3x12fpoi36eujbdkgdnbvbi6r  node-2  ACCEPTED    READY   ACTIVE
4spl3tyipofoa2iwqgabsdcve  node-1  ACCEPTED    READY   DRAIN         REACHABLE *
dknwk1uqxhnyyujq66ho0h54t  node-3  ACCEPTED    READY   ACTIVE

$ swarmctl service inspect redis
ID                   : 08ecg7vc7cbf9k57qs722n2le
Name                 : redis
Replicas             : 6/6
Update Status
 State               : COMPLETED
 Started             : 2 minutes ago
 Completed           : 1 minute ago
 Message             : update completed
Template
 Container
  Image              : redis:3.0.7

Task ID                      Service    Slot    Image          Desired State    Last State                Node
-------                      -------    ----    -----          -------------    ----------                ----
8uy2fy8dqbwmlvw5iya802tj0    redis      1       redis:3.0.7    RUNNING          RUNNING 23 seconds ago    node-2
7h9lgvidypcr7q1k3lfgohb42    redis      2       redis:3.0.7    RUNNING          RUNNING 2 minutes ago     node-3
ae4dl0chk3gtwm1100t5yeged    redis      3       redis:3.0.7    RUNNING          RUNNING 23 seconds ago    node-3
9fz7fxbg0igypstwliyameobs    redis      4       redis:3.0.7    RUNNING          RUNNING 2 minutes ago     node-3
drzndxnjz3c8iujdewzaplgr6    redis      5       redis:3.0.7    RUNNING          RUNNING 23 seconds ago    node-2
7rcgciqhs4239quraw7evttyf    redis      6       redis:3.0.7    RUNNING          RUNNING 2 minutes ago     node-2

As you can see, every Task running on node-1 was rebalanced to either node-2 or node-3 by the reconciliation loop.
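The combined effect of draining a node and a spread placement can be sketched as below. This is a toy model, not SwarmKit's scheduler: the `Node` type and the `leastLoaded` and `drain` functions are hypothetical, load is measured only by task count, and ties are broken by node name.

```go
package main

import "fmt"

// Node is a simplified view of a swarm node for this sketch.
type Node struct {
	Name   string
	Active bool // false models a Paused or Drained node
	Tasks  int  // number of tasks currently placed on the node
}

// leastLoaded returns the index of the active node with the fewest tasks,
// breaking ties by name. It returns -1 if no node is active.
func leastLoaded(nodes []Node) int {
	best := -1
	for i, n := range nodes {
		if !n.Active {
			continue
		}
		if best == -1 || n.Tasks < nodes[best].Tasks ||
			(n.Tasks == nodes[best].Tasks && n.Name < nodes[best].Name) {
			best = i
		}
	}
	return best
}

// drain marks the named node as unavailable and moves its tasks, one at a
// time, to whichever active node is least loaded at that moment.
func drain(nodes []Node, name string) error {
	idx := -1
	for i := range nodes {
		if nodes[i].Name == name {
			idx = i
		}
	}
	if idx == -1 {
		return fmt.Errorf("unknown node %q", name)
	}
	nodes[idx].Active = false
	for nodes[idx].Tasks > 0 {
		target := leastLoaded(nodes)
		if target == -1 {
			return fmt.Errorf("no active node left to take tasks from %q", name)
		}
		nodes[idx].Tasks--
		nodes[target].Tasks++
	}
	return nil
}

func main() {
	nodes := []Node{
		{Name: "node-1", Active: true, Tasks: 2},
		{Name: "node-2", Active: true, Tasks: 2},
		{Name: "node-3", Active: true, Tasks: 2},
	}
	if err := drain(nodes, "node-1"); err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, n := range nodes {
		fmt.Printf("%s active=%v tasks=%d\n", n.Name, n.Active, n.Tasks)
	}
}
```

Starting from two tasks per node, draining node-1 leaves node-2 and node-3 with three tasks each, which mirrors the even split shown in the inspect output above.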

License

SwarmKit is released under the Apache License, Version 2.0 (http://www.apache.org/licenses/LICENSE-2.0).
