
Understanding the "VOLUME" instruction in a Dockerfile

crosscheck 2020. 9. 23. 07:14


Below is the content of my "Dockerfile".

FROM node:boron

# Create app directory
RUN mkdir -p /usr/src/app

# change working dir to /usr/src/app
WORKDIR /usr/src/app

VOLUME . /usr/src/app

RUN npm install

EXPOSE 8080

CMD ["node" , "server" ]

In this file I expect the "VOLUME . /usr/src/app" instruction to mount the contents of the current working directory on the host onto the /usr/src/app folder of the container.

Please let me know if this is the correct way.


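The official Docker tutorial says:

Data volumes are specially designated directories within one or more containers that bypass the Union File System. Data volumes provide several useful features for persistent or shared data:

  • Volumes are initialized when a container is created. If the container's base image contains data at the specified mount point, that existing data is copied into the new volume upon volume initialization. (Note that this does not apply when mounting a host directory.)

  • Data volumes can be shared and reused among containers.

  • Changes to a data volume are made directly.

  • Changes to a data volume will not be included when you update an image.

  • Data volumes persist even if the container itself is deleted.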

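In the Dockerfile you can specify only the destination of a volume inside the container, e.g. /usr/src/app.

When you run a container, e.g. docker run --volume=/opt:/usr/src/app my_image, you may, but do not have to, specify its mount point on the host machine (/opt). If you do not specify the --volume argument, the mount point is chosen automatically.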
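
As a rough sketch of that split, and not a definitive fix for the Dockerfile in the question, the snippet below writes a trimmed Dockerfile that declares only the container-side path and then prints the run command that supplies the host side. The scratch directory and the trimmed Dockerfile are assumptions made for illustration; /opt and my_image are the example values used above.

```shell
#!/bin/sh
# Scratch directory so the sketch does not touch a real project (assumption).
build_dir="$(mktemp -d)"

# VOLUME takes container-side paths only; there is no host:container form here.
cat > "$build_dir/Dockerfile" <<'EOF'
FROM node:boron
WORKDIR /usr/src/app
VOLUME /usr/src/app
EXPOSE 8080
CMD ["node", "server"]
EOF

# The host side (/opt) is chosen by whoever runs the container.
run_cmd="docker run --volume=/opt:/usr/src/app my_image"
printf '%s\n' "$run_cmd"
```

The printed command is only echoed, not executed, so the snippet can be tried on a machine without Docker.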


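In short: No, your VOLUME instruction is not correct.

The Dockerfile's VOLUME instruction specifies one or more volumes by their container-side paths. It does not allow the image author to specify a host path. On the host side, the volumes are created with a very long ID-like name inside the Docker root. On my machine this is /var/lib/docker/volumes.

Note: Because the autogenerated name is extremely long and makes no sense from a human's perspective, these volumes are often referred to as "unnamed" or "anonymous".

Your example that uses a '.' character will not even run on my machine, no matter whether I make the dot the first or the second argument. I get this error message: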

docker: Error response from daemon: oci runtime error: container_linux.go:265: starting container process caused "process_linux.go:368: container init caused \"open /dev/ptmx: no such file or directory\"".

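I know that what has been said up to this point is probably not very valuable to someone trying to understand VOLUME and -v, and it certainly does not provide a solution for what you are trying to accomplish. So, hopefully, the following examples will shed some more light on these issues.

Minitutorial: Specifying volumes

Given this Dockerfile: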

FROM openjdk:8u131-jdk-alpine
VOLUME vol1 vol2

(For the outcome of this minitutorial, it makes no difference whether we specify vol1 vol2 or /vol1 /vol2; don't ask me why.)

Build it:

docker build -t my-openjdk .

Run:

docker run --rm -it my-openjdk

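Inside the container, run ls on the command line and you'll notice that two directories exist: /vol1 and /vol2.

Running the container also creates two directories, or "volumes", on the host side.

While the container is running, execute docker volume ls on the host machine and you'll see something like this (I have replaced the middle part of the names with three dots for brevity):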

DRIVER    VOLUME NAME
local     c984...e4fc
local     f670...49f0

Back in the container, execute touch /vol1/weird-ass-file (this creates an empty file at that location).

This file is now available on the host machine, in one of the unnamed volumes lol. It took me two tries because I first tried the first listed volume, but eventually I did find my file in the second listed volume, using this command on the host machine:

sudo ls /var/lib/docker/volumes/f670...49f0/_data

Similarly, you can try to delete this file on the host and it will be deleted in the container as well.

Note: The _data folder is also referred to as a "mount point".

Exit out from the container and list the volumes on the host. They are gone. We used the --rm flag when running the container and this option effectively wipes out not just the container on exit, but also the volumes.

Run a new container, but specify a volume using -v:

docker run --rm -it -v /vol3 my-openjdk

This adds a third volume and the whole system ends up having three unnamed volumes. The command would have crashed had we specified only -v vol3. The argument must be an absolute path inside the container. On the host-side, the new third volume is anonymous and resides together with the other two volumes in /var/lib/docker/volumes/.

It was stated earlier that the Dockerfile cannot map to a host path, which poses a problem when we want to bring files from the host into the container at runtime. A different -v syntax solves this problem.

Imagine I have a subfolder in my project directory ./src that I wish to sync to /src inside the container. This command does the trick:

docker run -it -v $(pwd)/src:/src my-openjdk

Both sides of the : character expect an absolute path: the left side is an absolute path on the host machine, and the right side is an absolute path inside the container. pwd is a command that prints the current working directory. Wrapping the command in $() runs it in a subshell and substitutes its output, here the absolute path to our project directory.
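
To make the substitution visible, here is a small sketch that builds the same -v argument step by step and checks that both sides are absolute before printing the command. The variable names are my own, and the command is printed rather than executed so it can be tried without the image.

```shell
#!/bin/sh
# $(pwd) is replaced by the absolute path of the current directory.
host_src="$(pwd)/src"
container_src="/src"

# Both sides of ':' should start with '/'.
for path in "$host_src" "$container_src"; do
    case "$path" in
        /*) ;;
        *) echo "not an absolute path: $path" >&2; exit 1 ;;
    esac
done

# Quoting the argument keeps it intact if the project path contains spaces.
printf 'docker run -it -v "%s:%s" my-openjdk\n' "$host_src" "$container_src"
```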

Putting it all together, assume we have ./src/Hello.java in our project folder on the host machine with the following contents:

public class Hello {
    public static void main(String... ignored) {
        System.out.println("Hello, World!");
    }
}

We build this Dockerfile:

FROM openjdk:8u131-jdk-alpine
WORKDIR /src
ENTRYPOINT javac Hello.java && java Hello

We run this command:

docker run -v $(pwd)/src:/src my-openjdk

This prints "Hello, World!".

The best part is that we're completely free to modify the .java file with a new message for another output on a second run - without having to rebuild the image =)

Final remarks

I am quite new to Docker, and the aforementioned "tutorial" reflects information I gathered from a 3-day command line hackathon. I am almost ashamed I haven't been able to provide links to clear English-like documentation backing up my statements, but I honestly think this is due to a lack of documentation and not personal effort. I do know the examples work as advertised using my current setup which is "Windows 10 -> Vagrant 2.0.0 -> Docker 17.09.0-ce".

The tutorial does not solve the problem "how do we specify the container's path in the Dockerfile and let the run command only specify the host path". There might be a way, I just haven't found it.

Finally, I have a gut feeling that specifying VOLUME in the Dockerfile is not just uncommon, but it's probably a best practice to never use VOLUME. For two reasons. The first reason we have already identified: We can not specify the host path - which is a good thing because Dockerfiles should be very agnostic to the specifics of a host machine. But the second reason is people might forget to use the --rm option when running the container. One might remember to remove the container but forget to remove the volume. Plus, even with the best of human memory, it might be a daunting task to figure out which of all anonymous volumes are safe to remove.
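
As a hedged sketch of that cleanup chore: looks_anonymous and list_dangling_anonymous below are hypothetical helpers, and the 64-hex-character rule is a heuristic based on the volume names shown in this thread, not a documented guarantee. The second helper relies on docker volume ls with -q and --filter dangling=true, which lists volumes not referenced by any container, and is only defined here, not called.

```shell
#!/bin/sh
# Heuristic (an assumption): anonymous volumes get a 64-character hex name.
looks_anonymous() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{64}$'
}

# Print anonymous-looking volumes that no container references.
# Needs a running Docker daemon, so it is defined but not called here.
list_dangling_anonymous() {
    docker volume ls -q --filter dangling=true | while read -r name; do
        if looks_anonymous "$name"; then
            printf '%s\n' "$name"
        fi
    done
}

# A human-chosen name does not match the heuristic.
looks_anonymous "my-data" || echo "my-data looks like a named volume"
```

Once the list has been reviewed, docker rm -v removes a container together with its anonymous volumes, and docker volume prune removes unused volumes in bulk.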


The VOLUME command in a Dockerfile is quite legit, totally conventional, absolutely fine to use, and it is not deprecated in any way. You just need to understand it.

We use it to point to any directories which the app in the container will write to a lot. We don't use VOLUME just because we want to share between host and container like a config file.

The command simply needs one param: a path to a folder inside the container, relative to WORKDIR if set. Docker will then create a volume in its graph (/var/lib/docker) and mount it to that folder in the container. Now the container has somewhere to write to with high performance. Without the VOLUME command, writes to the specified folder can be very slow because the container uses its copy-on-write strategy in the container's own writable layer. The copy-on-write strategy is a main reason why volumes exist.

If you mount over the folder specified by the VOLUME command, the command is never run because VOLUME is only executed when the container starts, kind of like ENV.

Basically with VOLUME command you get performance without externally mounting any volumes. Data will save across container runs too without any external mounts. Then when ready simply mount something over it.

Some good example use cases:
- logs
- temp folders

Some bad use cases:
- static files
- configs
- code


Specifying a VOLUME line in a Dockerfile configures a bit of metadata on your image, but how that metadata is used is important.

First, what did these two lines do:

WORKDIR /usr/src/app
VOLUME . /usr/src/app

The WORKDIR line there creates the directory if it doesn't exist, and updates some image metadata so that all relative paths, along with the current directory for commands like RUN, will be in that location. The VOLUME line there specifies two volumes: one is the relative path ., and the other is /usr/src/app; both just happen to be the same directory. Most often the VOLUME line contains only a single directory, but it can contain multiple as you've done, or it can be a JSON-formatted array.
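
To see why both arguments name the same directory, here is a simplified model of how a VOLUME argument resolves against WORKDIR. resolve_volume is a hypothetical helper written for this illustration, not part of Docker, and it ignores cases such as .. or trailing slashes.

```shell
#!/bin/sh
# Simplified model (an assumption, not Docker's code):
# resolve a VOLUME argument ($2) against the current WORKDIR ($1).
resolve_volume() {
    case "$2" in
        /*) printf '%s\n' "$2" ;;          # absolute path: used as-is
        .)  printf '%s\n' "$1" ;;          # "." is the WORKDIR itself
        *)  printf '%s\n' "${1%/}/$2" ;;   # relative path: appended to WORKDIR
    esac
}

first="$(resolve_volume /usr/src/app .)"
second="$(resolve_volume /usr/src/app /usr/src/app)"
printf '%s\n%s\n' "$first" "$second"
```

Both lines print /usr/src/app. On a built image, docker inspect --format '{{json .Config.Volumes}}' followed by the image name should show which paths were actually recorded.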

You cannot specify a volume source in the Dockerfile: A common source of confusion when specifying volumes in a Dockerfile is trying to match the runtime syntax of a source and destination at image build time, this will not work. The Dockerfile can only specify the destination of the volume. It would be a trivial security exploit if someone could define the source of a volume since they could update a common image on the docker hub to mount the root directory into the container and then launch a background process inside the container as part of an entrypoint that adds logins to /etc/passwd, configures systemd to launch a bitcoin miner on next reboot, or searches the filesystem for credit cards, SSNs, and private keys to send off to a remote site.

What does the VOLUME line do? As mentioned, it sets some image metadata to say a directory inside the image is a volume. How is this metadata used? Every time you create a container from this image, docker will force that directory to be a volume. If you do not provide a volume in your run command, or compose file, the only option for docker is to create an anonymous volume. This is a local named volume with a long unique id for the name and no other indication for why it was created or what data it contains (anonymous volumes are where data goes to get lost). If you override the volume, pointing to a named or host volume, your data will go there instead.

VOLUME breaks things: You cannot disable a volume once defined in a Dockerfile. And more importantly, the RUN command in docker is implemented with temporary containers. Those temporary containers will get a temporary anonymous volume. That anonymous volume will be initialized with the contents of your image. Any writes inside the container from your RUN command will be made to that volume. When the RUN command finishes, changes to the image are saved, and changes to the anonymous volume are discarded. Because of this, I strongly recommend against defining a VOLUME inside the Dockerfile. It results in unexpected behavior for downstream users of your image who wish to extend the image with initial data in the volume location.
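
A minimal sketch of the ordering problem, assuming the classic builder behaviour described above (as far as I know, BuildKit may keep such changes, so treat the outcome as builder-dependent). The /data path and the file names are made up for illustration; the snippet only writes the Dockerfile and leaves the build to you.

```shell
#!/bin/sh
# Scratch directory for the demo Dockerfile (assumption).
demo_dir="$(mktemp -d)"

cat > "$demo_dir/Dockerfile" <<'EOF'
FROM openjdk:8u131-jdk-alpine
# Written before the declaration: becomes part of the image.
RUN mkdir -p /data && echo kept > /data/before.txt
VOLUME /data
# Written after the declaration: lands in a temporary anonymous volume
# during the build and may be discarded.
RUN echo lost > /data/after.txt
EOF

cat "$demo_dir/Dockerfile"
```

After building it, listing /data in a fresh container shows which of the two files survived on your setup.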

How should you specify a volume? To specify where you want to include volumes with your image, provide a docker-compose.yml. Users can modify that to adjust the volume location to their local environment, and it captures other runtime settings like publishing ports and networking.
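
A minimal docker-compose.yml along those lines might look like the sketch below, written from the shell so it can be tried in a scratch directory. The service name app, the image name my_image, and the 8080 port mapping are assumptions carried over from the question for illustration.

```shell
#!/bin/sh
# Scratch directory for the compose file (assumption).
compose_dir="$(mktemp -d)"

cat > "$compose_dir/docker-compose.yml" <<'EOF'
services:
  app:
    image: my_image
    ports:
      - "8080:8080"
    volumes:
      # host path (relative to this file) : container path
      - .:/usr/src/app
EOF

cat "$compose_dir/docker-compose.yml"
```

Users who keep their sources elsewhere only need to change the left-hand side of the volumes entry.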

Someone should document this! They have. Docker includes warnings on the VOLUME usage in their documentation on the Dockerfile along with advice to specify the source at runtime:

  • Changing the volume from within the Dockerfile: If any build steps change the data within the volume after it has been declared, those changes will be discarded.

...

  • The host directory is declared at container run-time: The host directory (the mountpoint) is, by its nature, host-dependent. This is to preserve image portability, since a given host directory can’t be guaranteed to be available on all hosts. For this reason, you can’t mount a host directory from within the Dockerfile. The VOLUME instruction does not support specifying a host-dir parameter. You must specify the mountpoint when you create or run the container.

To better understand the VOLUME instruction in a Dockerfile, let us look at a typical use of it in the official MySQL Dockerfile.

VOLUME /var/lib/mysql

Reference: https://github.com/docker-library/mysql/blob/3362baccb4352bcf0022014f67c1ec7e6808b8c5/8.0/Dockerfile

/var/lib/mysql is the default location where MySQL stores its data files.

When you run a container for test purposes only, you may not specify its mount point, e.g.

docker run mysql:8

The mysql container instance then uses the default mount path, which is specified by the VOLUME instruction in the Dockerfile. The volume is created with a very long ID-like name inside the Docker root; this is called an "unnamed" or "anonymous" volume. It lives on the underlying host system under /var/lib/docker/volumes:

/var/lib/docker/volumes/320752e0e70d1590e905b02d484c22689e69adcbd764a69e39b17bc330b984e4

This is very convenient for quick tests: there is no need to specify a mount point, yet you still get the best performance by storing data in a volume rather than in the container layer.

For production use, you will want to override the mount point with a named volume or, as here, a host directory, e.g.

docker run -v /my/own/datadir:/var/lib/mysql mysql:8

The command mounts the /my/own/datadir directory from the underlying host system as /var/lib/mysql inside the container. The data directory /my/own/datadir won't be deleted automatically, even if the container is deleted.
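
The -v forms that appear in this thread can be told apart by the shape of the argument. The helper below is a simplified classifier written for illustration (classify_mount and the volume name mysql-data are hypothetical); it ignores options such as :ro and Windows-style paths.

```shell
#!/bin/sh
# Simplified classifier for a -v argument (an illustration, not Docker's parser).
classify_mount() {
    case "$1" in
        /*:/*) echo "host directory" ;;    # /host/path:/container/path
        *:/*)  echo "named volume" ;;      # name:/container/path
        /*)    echo "anonymous volume" ;;  # /container/path only
        *)     echo "invalid" ;;           # e.g. a bare relative path
    esac
}

for spec in /my/own/datadir:/var/lib/mysql mysql-data:/var/lib/mysql /vol3 vol3; do
    printf '%-35s -> %s\n' "$spec" "$(classify_mount "$spec")"
done
```

Only the first two forms put the data somewhere you chose; the third is the anonymous volume that Docker names for you.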

Reference (usage of the official mysql image): https://hub.docker.com/_/mysql/

Reference URL: https://stackoverflow.com/questions/41935435/understanding-volume-instruction-in-dockerfile
