Diskover v2 Community Edition Install Guide
Below is an install guide for diskover v2 and diskover-web v2 community edition (ce). It is written for CentOS 7.x but can also serve as a rough guide for installing on Ubuntu or other Linux distros. If you are looking for documentation on how to use Diskover v2, see the v2 user guide.
- Installation How-to - diskover
- Installation How-to - diskover-web
- Updating Diskover v2 community edition to latest version
- Running Windows 10 Scanner
Requirements
- Python 3.5+
- Elasticsearch 7.x
- PHP 7.x + PHP-FPM
- Nginx
Notes
- Disabling SELinux and using a software firewall are optional and not required to run diskover.
- Internet access is required during install to download packages with yum.
- Apache could be used instead of Nginx, but its setup is not covered in this guide.
Installation How-to - diskover
- Install CentOS 7.x (tested with CentOS 7.8 DVD ISO using minimal install)
- Disable SELinux (optional; not required to run diskover. If you leave SELinux enabled, you will need to adjust the SELinux policies to allow diskover to run)
vi /etc/sysconfig/selinux
** set SELINUX=disabled
reboot now
- Update Server
yum -y update
- Install Java 8 JDK (OpenJDK) (required for Elasticsearch)
yum -y install java-1.8.0-openjdk.x86_64
- Install Elasticsearch 7.x
yum install -y https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.10.2-x86_64.rpm
**Set JVM configuration (mem heap size)
vi /etc/elasticsearch/jvm.options
-Xms8g ** set to 50% of Memory, up to 32g max
-Xmx8g ** set to 50% of Memory, up to 32g max
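The 50% rule above can be worked out from the host's total memory. The sketch below is only an illustration and not part of diskover or Elasticsearch: `heap_gb` is a hypothetical helper name, and it assumes a Linux host where /proc/meminfo reports MemTotal in kB. It rounds down to whole GB.

```shell
# heap_gb TOTAL_KB: print a heap size in whole GB following the rule above
# (50% of memory, minimum 1, capped at 32). Hypothetical helper name.
heap_gb() {
    total_kb=$1
    half_gb=$(( total_kb / 2 / 1024 / 1024 ))
    if [ "$half_gb" -lt 1 ]; then half_gb=1; fi
    if [ "$half_gb" -gt 32 ]; then half_gb=32; fi
    echo "$half_gb"
}

# Example: read MemTotal (in kB) and print the two jvm.options lines.
if [ -r /proc/meminfo ]; then
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    gb=$(heap_gb "$mem_kb")
    echo "-Xms${gb}g"
    echo "-Xmx${gb}g"
fi
```

The two printed lines would then be pasted into /etc/elasticsearch/jvm.options by hand.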
**Set Firewall rules
firewall-cmd --add-port=9200/tcp --permanent
firewall-cmd --reload
**Update /etc/elasticsearch/elasticsearch.yml
network.host: ** leave commented out for localhost (default) or uncomment and set to the ip you want to bind to, using "0.0.0.0" will bind to all ips
discovery.seed_hosts: ** leave commented out for ["127.0.0.1", "[::1]"] (default) or uncomment and set to ["<host ip>"]
path.data: ** set to fast SSD path or other fast disk
path.logs: ** set to fast SSD path or other fast disk
bootstrap.memory_lock: true *** uncomment
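The elasticsearch.yml edits above can also be scripted instead of made in vi. A minimal sketch, assuming flat top-level `key: value` lines and values without `|`, `&` or backslashes: `set_yaml_key` is a hypothetical helper (not an Elasticsearch tool) and /ssd/es/data is a placeholder path. It is demonstrated on a scratch file rather than the live config.

```shell
# set_yaml_key FILE KEY VALUE: uncomment or replace a top-level "KEY: ..." line,
# or append it if the key is absent. Hypothetical helper; flat keys only.
set_yaml_key() {
    file=$1
    key=$2
    value=$3
    if grep -Eq "^#?[[:space:]]*${key}:" "$file"; then
        # Write through a temp copy, then overwrite in place so the
        # original file keeps its owner and permissions.
        sed -E "s|^#?[[:space:]]*${key}:.*|${key}: ${value}|" "$file" > "${file}.new" &&
            cat "${file}.new" > "$file" &&
            rm -f "${file}.new"
    else
        printf '%s: %s\n' "$key" "$value" >> "$file"
    fi
}

# Demo on a scratch file that mimics two commented-out settings.
scratch=$(mktemp)
printf '%s\n' '#bootstrap.memory_lock: true' '#path.data: /path/to/data' > "$scratch"
set_yaml_key "$scratch" bootstrap.memory_lock true
set_yaml_key "$scratch" path.data /ssd/es/data
cat "$scratch"
```

Pointing the helper at /etc/elasticsearch/elasticsearch.yml would apply the same changes to the real file; make a backup first.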
**Update elasticsearch systemd service settings
mkdir /etc/systemd/system/elasticsearch.service.d
vi /etc/systemd/system/elasticsearch.service.d/elasticsearch.conf
**Add the text
[Service]
LimitMEMLOCK=infinity
LimitNPROC=4096
LimitNOFILE=65536
**
systemctl enable elasticsearch.service
systemctl start elasticsearch.service
systemctl status elasticsearch.service
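Once the service is running, the Elasticsearch cluster health endpoint (`/_cluster/health`) gives a quick confirmation that it is answering on port 9200. A minimal sketch, assuming curl is installed and Elasticsearch is reachable without authentication: `es_status` and `check_es` are hypothetical helper names.

```shell
# es_status: read a cluster health JSON document on stdin and print its
# "status" field (green, yellow or red). Hypothetical helper; sed is used
# only to avoid extra dependencies such as jq.
es_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# check_es [URL]: query the health endpoint and report the status.
check_es() {
    url=${1:-http://localhost:9200}
    status=$(curl -s "${url}/_cluster/health" | es_status)
    if [ -n "$status" ]; then
        echo "Elasticsearch status: $status"
    else
        echo "No response from $url" >&2
        return 1
    fi
}
```

Run `check_es` on the Elasticsearch host, or `check_es http://<es host ip>:9200` from another machine. A single-node cluster commonly reports yellow once indices with replicas exist, because replica shards cannot be allocated on the same node.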
- Install Kibana 7.x (optional)
yum install -y https://artifacts.elastic.co/downloads/kibana/kibana-7.10.2-x86_64.rpm
vi /etc/kibana/kibana.yml
**Uncomment and set the following line:
server.host: "<host ip>"
**Uncomment and set the following line if ES is not listening on localhost:
elasticsearch.hosts: ["http://<es host ip>:9200"]
**Set Firewall rules
firewall-cmd --add-port=5601/tcp --permanent
firewall-cmd --reload
systemctl enable kibana.service
systemctl start kibana.service
systemctl status kibana.service
For securing Elasticsearch and Kibana, see the security guide.
- Install Python 3 (Python 3.6.8), Pip and dev tools
yum -y install python3 python3-devel gcc
python3 -V
pip3 -V
- Install Git
yum -y install git
- Install diskover
** Clone diskover community edition from GitHub repo
mkdir /tmp/diskover_install
git clone https://github.com/diskoverdata/diskover-community.git /tmp/diskover_install
cd /tmp/diskover_install
** Copy diskover files to opt
cp -a diskover /opt/
cd /opt/diskover
** Install required python dependencies
pip3 install -r requirements.txt
*** If indexing to AWS Elasticsearch run
pip3 install -r requirements-aws.txt
** Copy default/sample configs
for d in configs_sample/*; do d=$(basename "$d") && mkdir -p ~/.config/"$d" && cp configs_sample/"$d"/config.yaml ~/.config/"$d"/; done
** edit diskover config file
vi ~/.config/diskover/config.yaml
** set databases > elasticsearch > host to your elasticsearch hostname/ip
- Mount your network storage (set up client connection to storage)
*** for NFS
yum -y install nfs-utils
mkdir /mnt/nfsstor1
mount -t nfs -o ro,noatime,nodiratime server_name:/export_name /mnt/nfsstor1
*** for SMB/CIFS
yum -y install cifs-utils
mkdir /mnt/smbstor1
mount -t cifs -o username=user_name //server_name/share_name /mnt/smbstor1
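Before crawling, it is worth confirming that the storage is actually mounted, otherwise the crawl would index an empty local directory. A minimal sketch, assuming Linux and mount paths without spaces: `is_mounted` is a hypothetical helper name.

```shell
# is_mounted DIR: succeed if DIR is listed as a mount point in /proc/mounts.
# Hypothetical helper; assumes Linux and a path without spaces.
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' /proc/mounts
}

# Example: check the NFS mount point used above.
if is_mounted /mnt/nfsstor1; then
    echo "/mnt/nfsstor1 is mounted"
else
    echo "/mnt/nfsstor1 is not mounted" >&2
fi
```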
- Run your first crawl
cd /opt/diskover
**start crawling
python3 diskover.py -i diskover-<indexname> <storage_top_dir>
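Elasticsearch index names must be lowercase and cannot contain spaces or characters such as / \ * ? " < > |, so it helps to derive the <indexname> part from a label in a predictable way. A minimal sketch: `index_name` is a hypothetical helper, and /mnt/nfsstor1 is the example mount point from the previous step. The crawl command is only echoed, not run.

```shell
# index_name LABEL: lowercase LABEL, replace every character outside
# a-z, 0-9, _ and - with a dash, and add the diskover- prefix.
# Hypothetical helper name.
index_name() {
    label=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9_-' '-')
    printf 'diskover-%s\n' "$label"
}

idx=$(index_name "NFS Stor1")
echo "python3 diskover.py -i $idx /mnt/nfsstor1"
```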
Installation How-to - diskover-web
- Install Nginx
yum -y install epel-release yum-utils
yum -y install http://rpms.remirepo.net/enterprise/remi-release-7.rpm
yum -y install nginx
systemctl enable nginx
systemctl start nginx
systemctl status nginx
- Install PHP 7 and PHP-FPM (fastcgi)
yum-config-manager --enable remi-php74
yum -y install php php-common php-fpm php-opcache php-pecl-mcrypt php-cli php-gd php-mysqlnd php-ldap php-pecl-zip php-xml php-xmlrpc php-mbstring php-json
vi /etc/php-fpm.d/www.conf
** change user = nginx and group = nginx
** uncomment and change listen.owner = nginx and listen.group = nginx
** change listen to listen = /var/run/php-fpm/php-fpm.sock
chown -R root:nginx /var/lib/php
chown -R nginx:nginx /var/run/php-fpm/
systemctl enable php-fpm
systemctl start php-fpm
systemctl status php-fpm
- Install diskover-web
** Clone diskover community edition from GitHub repo (skip this step if you already did it when installing diskover)
mkdir /tmp/diskover_install
git clone https://github.com/diskoverdata/diskover-community.git /tmp/diskover_install
cd /tmp/diskover_install
** Copy web files to www
cp -a diskover-web /var/www/
** Edit diskover-web config
cd /var/www/diskover-web/src/diskover
cp Constants.php.sample Constants.php
vi Constants.php ** diskover-web config file
** set ES_HOST to your elasticsearch hostname/ip
** change PASS to a strong password (default diskover user password is darkdata)
chown -R nginx:nginx /var/www/diskover-web
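For the PASS setting above, a random password is a safer replacement for the default than a hand-picked one. A minimal sketch, assuming /dev/urandom and the coreutils `base64` command are available: `gen_password` is a hypothetical helper name.

```shell
# gen_password: print a random 24-character password.
# 18 random bytes encode to exactly 24 base64 characters (no padding).
gen_password() {
    head -c 18 /dev/urandom | base64
}

gen_password
```

The output may contain + and / characters, which are fine inside the quoted string in Constants.php.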
** Create nginx config
vi /etc/nginx/conf.d/diskover-web.conf
*** add below text to diskover-web.conf
server {
    listen 8000;
    server_name diskover-web;
    root /var/www/diskover-web/public;
    index index.php index.html index.htm;
    error_log /var/log/nginx/error.log;
    access_log /var/log/nginx/access.log;
    location / {
        try_files $uri $uri/ /index.php?$args =404;
    }
    location ~ \.php(/|$) {
        fastcgi_split_path_info ^(.+\.php)(/.+)$;
        set $path_info $fastcgi_path_info;
        fastcgi_param PATH_INFO $path_info;
        try_files $fastcgi_script_name =404;
        fastcgi_pass unix:/var/run/php-fpm/php-fpm.sock;
        #fastcgi_pass 127.0.0.1:9000;
        fastcgi_index index.php;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_read_timeout 900;
        fastcgi_buffers 16 16k;
        fastcgi_buffer_size 32k;
    }
}
systemctl reload nginx
**open firewall ports for diskover-web
firewall-cmd --add-port=8000/tcp --permanent
firewall-cmd --reload
- View index in diskover-web after crawl finishes
http://<host_ip>:8000/
* default login is username: diskover and password: darkdata
* password can be set in web config file Constants.php
- Check for any errors in nginx log (e.g. permission issues)
tail -f /var/log/nginx/error.log
Updating Diskover v2 community edition to latest version
Make a backup of your existing config files (optional):
cd ~/.config/diskover && cp config.yaml config.yaml.bak
cd <diskover-web_dir>/src/diskover && cp Constants.php Constants.php.bak
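A plain `.bak` copy is overwritten on the next update. If you want to keep one backup per update, a timestamped copy is one option. A minimal sketch: `backup_file` is a hypothetical helper name, not part of diskover.

```shell
# backup_file FILE: copy FILE to FILE.bak.<timestamp>, preserving mode and
# modification time, and print the name of the copy. Hypothetical helper.
backup_file() {
    src=$1
    dest="${src}.bak.$(date +%Y%m%d-%H%M%S)"
    cp -p "$src" "$dest" && echo "$dest"
}

# Example: back up the diskover config if it exists.
if [ -f "$HOME/.config/diskover/config.yaml" ]; then
    backup_file "$HOME/.config/diskover/config.yaml"
fi
```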
If the diskover repo is no longer cloned in /tmp/diskover_install, clone again:
mkdir /tmp/diskover_install
git clone https://github.com/diskoverdata/diskover-community.git /tmp/diskover_install
Update local cloned repo and sync changes to installed locations:
cd /tmp/diskover_install
git pull
rsync -rcv diskover/ /opt/diskover/
rsync -rcv diskover-web/ /var/www/diskover-web/
chown -R nginx:nginx /var/www/diskover-web
Check your config files are not missing any new settings:
diff <diskover_dir>/configs_sample/diskover/config.yaml ~/.config/diskover/config.yaml
cd <diskover-web_dir>/src/diskover && diff Constants.php.sample Constants.php
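The diff output above also shows every value you changed on purpose, which can bury newly added settings. To list only setting names that exist in the sample but not in your config.yaml, something like the sketch below may help. It is a rough aid, not a YAML parser: `list_keys` and `missing_keys` are hypothetical helpers, nesting is ignored, and a name counts as present if it appears anywhere in your file.

```shell
# list_keys FILE: print the distinct setting names ("name:" at the start of
# a line, optionally indented), sorted. Commented-out lines are ignored.
list_keys() {
    sed -n 's/^[[:space:]]*\([A-Za-z_][A-Za-z0-9_]*\):.*/\1/p' "$1" | sort -u
}

# missing_keys SAMPLE CURRENT: print names found in SAMPLE but not in CURRENT.
missing_keys() {
    sample_keys=$(mktemp)
    current_keys=$(mktemp)
    list_keys "$1" > "$sample_keys"
    list_keys "$2" > "$current_keys"
    comm -23 "$sample_keys" "$current_keys"
    rm -f "$sample_keys" "$current_keys"
}

# Example (paths from this guide):
# missing_keys /opt/diskover/configs_sample/diskover/config.yaml ~/.config/diskover/config.yaml
```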
Restart nginx and php-fpm:
systemctl restart nginx
systemctl restart php-fpm
Check for any errors in nginx log (e.g. permission issues):
tail -f /var/log/nginx/error.log
Running Windows 10 Scanner
- Extract the diskover zip file from the ftp server to a temp folder
- Open a command prompt and copy the diskover folder to Program Files
xcopy C:\tmp\diskover "C:\Program Files\diskover" /E /H /C /I
- Install Python
Download Python 3.5+ from https://www.python.org/downloads/ or the Windows Store and install it
- Install Python Modules
open a command prompt (run as administrator)
cd "C:\Program Files\diskover"
pip3 install -r requirements-win.txt
*** If indexing to AWS Elasticsearch run
pip3 install -r requirements-aws.txt
- Copy default/sample configs
open a command prompt (run as administrator)
cd "C:\Program Files\diskover\configs_sample"
for /F %i in ('dir /b') do (mkdir "%APPDATA%\%i" & copy "%i\config.yaml" "%APPDATA%\%i")
- Set up diskover configuration file
Use Notepad to open the following configuration file
%APPDATA%\diskover\config.yaml
Set up Elasticsearch host information
*** If using Elasticsearch in AWS
Set AWS to True (remove the # comment indicator)
aws: True
Set the AWS Elasticsearch URL (remove the # comment indicator, and omit the https:// prefix)
host: <es host endpoint>
Set the port to the AWS port, 443
port: 443
Configure Username
user: myusername
Configure Password
password: changeme
***
*** If using on-prem Elasticsearch instance
Set host information
host: <es host ip>
Set Elasticsearch port
port: 9200
***
Set replacepaths to True
replace: True
- Generate an index / scan
Open a command prompt. Running as Administrator is optional, but may be needed for the elevated privileges required to scan/index all files.
cd "C:\Program Files\diskover"
python3 diskover.py -i diskover-<indexname> <top path>