# nopaque

nopaque bundles various tools and services that provide humanities scholars with Digital Humanities (DH) methods to support their individual research processes. With nopaque, researchers can run Optical Character Recognition (OCR) on digitized sources. The resulting text files then serve as the data basis for Natural Language Processing (NLP), which automatically adds various linguistic annotations. The processed data can be collected into corpora in the web application and analyzed through complex search queries using an information retrieval system. The web application's range of functions will be extended successively according to the needs of the researchers.

## Prerequisites and requirements

1. Install Docker for your system, following the official instructions.
2. Install docker-compose, following the official instructions.
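
Before continuing, it can help to confirm that both tools are actually on the `PATH`. The following is a minimal sketch, not part of nopaque itself; the helper name `check_command` is hypothetical and only reports whether a command can be found.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with nopaque): report whether a
# command is available on the PATH.
check_command() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check both prerequisites and remember whether anything is missing.
missing=0
for cmd in docker docker-compose; do
    check_command "$cmd" || missing=1
done

if [ "$missing" -ne 0 ]; then
    echo "Install the missing tools before continuing."
fi
```

The script only prints a report and does not abort, so it is safe to run on a machine that is not set up yet.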

## Configuration and startup

### **Create Docker swarm**

The generated computational workload is handled by a [Docker](https://docs.docker.com/) swarm. A swarm is a group of machines that run Docker and are joined into a cluster. It consists of two kinds of members: manager nodes and worker nodes. The swarm setup process is best described in the [Docker documentation](https://docs.docker.com/engine/swarm/swarm-tutorial/).
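
As a rough orientation, the manager/worker setup described above can be sketched with the standard `docker swarm` subcommands. The function names below are hypothetical wrappers introduced only for illustration, the IP address and token are placeholders you supply, and port 2377 is assumed to be Docker's default swarm management port. The linked tutorial remains the authoritative guide.

```bash
#!/usr/bin/env bash
# Hypothetical wrappers around standard Docker swarm commands.

# Run on the machine that should become the first manager node.
# $1: IP address other nodes use to reach this manager
swarm_init() {
    docker swarm init --advertise-addr "$1"
}

# Run on a manager: prints the full join command (including the token)
# that worker nodes have to execute.
swarm_worker_join_command() {
    docker swarm join-token worker
}

# Run on every machine that should join as a worker node.
# $1: join token, $2: manager IP address
swarm_join() {
    docker swarm join --token "$1" "$2:2377"
}

# Run on a manager: lists all nodes and their roles.
swarm_nodes() {
    docker node ls
}
```

A typical sequence would be `swarm_init <MANAGER-IP>` on the manager, `swarm_join <TOKEN> <MANAGER-IP>` on each worker, and finally `swarm_nodes` on the manager to confirm that every member shows up.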

### **Create network storage**

A shared network storage is necessary so that all swarm members can access all the data. To achieve this, a [Samba](https://www.samba.org/) share is used.
``` bash
# Example: Create a Samba share via Docker
# More details can be found under https://hub.docker.com/r/dperson/samba/
username@hostname:~$ sudo mkdir -p /srv/samba/nopaque
username@hostname:~$ docker run \
                       --name opaque_storage \
                       -v /srv/samba/nopaque:/srv/samba/nopaque \
                       -p 139:139 \
                       -p 445:445 \
                       dperson/samba \
                         -p -r -s "nopaque;/srv/samba/nopaque;no;no;no;nopaque" -u "nopaque;nopaque"

# Mount the Samba share on all swarm nodes (managers and workers)
username@hostname:~$ sudo mkdir /mnt/nopaque
username@hostname:~$ sudo mount --types cifs --options gid=${USER},password=nopaque,uid=${USER},user=nopaque,vers=3.0 //<SAMBA-SERVER-IP>/nopaque /mnt/nopaque
```
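
The mount command above passes the password inline, where it can end up in the shell history. As a hedged alternative sketch, `mount.cifs` also accepts a `credentials=<file>` option. The helper names and the credentials file location below are assumptions for illustration and are not part of nopaque; `uid` and `gid` are filled in from the calling user via `id`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write a CIFS credentials file that only its
# owner can read.
# $1: target file, $2: Samba user, $3: Samba password
write_cifs_credentials() {
    # Create the file with a restrictive umask, then enforce the mode
    # in case the file already existed.
    ( umask 077 && printf 'username=%s\npassword=%s\n' "$2" "$3" > "$1" )
    chmod 600 "$1"
}

# Hypothetical helper: mount the share using the credentials file
# instead of an inline password.
# $1: Samba server IP, $2: credentials file
mount_nopaque_share() {
    sudo mount --types cifs \
        --options "credentials=$2,uid=$(id -u),gid=$(id -g),vers=3.0" \
        "//$1/nopaque" /mnt/nopaque
}
```

For example, `write_cifs_credentials "$HOME/.nopaque-smb-credentials" nopaque nopaque` followed by `mount_nopaque_share <SAMBA-SERVER-IP> "$HOME/.nopaque-smb-credentials"` would mount the share on a node once `/mnt/nopaque` exists.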

### **Download, configure and build nopaque**

``` bash
# Clone the nopaque repository
username@hostname:~$ git clone https://gitlab.ub.uni-bielefeld.de/sfb1288inf/nopaque.git
# Change into the repository and create data directories for the database and message queue
username@hostname:~$ cd nopaque
username@hostname:~$ mkdir -p data/{db,mq}
username@hostname:~$ cp db.env.tpl db.env
username@hostname:~$ cp .env.tpl .env
# Fill out the variables within these files.
username@hostname:~$ <YOUR EDITOR> db.env
username@hostname:~$ <YOUR EDITOR> .env
# Create docker-compose.override.yml file
username@hostname:~$ touch docker-compose.override.yml
# Tweak the docker-compose.override.yml to satisfy your needs. (You can find examples in docker-compose.<example>.yml)
username@hostname:~$ <YOUR EDITOR> docker-compose.override.yml
# Build docker images
username@hostname:~$ docker-compose build
```

### **Start your instance**

``` bash
# Create log files
username@hostname:~$ touch nopaque.log nopaqued.log
# For background execution add the -d flag
username@hostname:~$ docker-compose up
# To scale your app use the following command after starting it normally
username@hostname:~$ docker-compose -f docker-compose.yml \
                                    -f docker-compose.override.yml \
                                    -f docker-compose.scale.yml \
                                    up \
                                    -d --no-recreate --scale nopaque=<NUM_INSTANCES>
```