Server¶
Launch Methods¶
MS Windows script
start\AFANASY\_afserver.cmd
UNIX script
start/AFANASY/_afserver.sh
Linux daemon when Linux packages are installed
sudo systemctl start afserver
Set up the CGRU environment and launch the command:
cd cgru
source ./setup.sh
afserver
System Job¶
The system job is designed to execute system tasks on the render farm (such as post commands). When the server needs to execute a command, it appends a task to the system job.
Note
Your farm should be configured to have system services that execute job post commands (remove rendered scenes).
You can explore the system job in the Watch GUI in super user mode and manipulate its parameters to control how it runs.

[Figure: System Job]

[Figure: System Job Tasks]
If a failed system task cannot be restarted (its number of error retries has reached the maximum value), it is deleted. This prevents the number of system tasks from growing.
You can watch the system job log and its task logs. When an error occurs, the command output is appended to the log.
To reset the system command queue, you can restart a block or a task.
Configuration¶
"af_sysjob_tasklife": 1800
Maximum system task age in seconds. When a task reaches this age, it is treated as an error task. This prevents the number of tasks from growing if some tasks cannot be executed (restarted).
"af_sysjob_tasksmax": 1000
Maximum number of running or ready tasks. When the number of tasks reaches this value, no new tasks are created. Commands are not lost: they are stored in a special list until some tasks are done. This prevents the number of tasks from growing if the system job stops running for some time (for example, when all hosts are in black lists). Tasks need more memory and CPU time than a simple list of commands.
"af_sysjob_postcmd_service": "postcmd"
Service type for Post Commands system block.
"af_sysjob_events_service":"events"
Service type for Events system block.
"af_sysjob_wol_service": "wakeonlan"
Service type for Wake-On-LAN system block.
"af_render_cmd_wolsleep": "wolsleep"
Sleep command performed by a render client.
"af_render_cmd_wolwake": "wolwake"
Wake command constructed by the server and performed on an online client by the system job.
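The interplay of af_sysjob_tasksmax and af_sysjob_tasklife described above can be sketched as a toy model. This is not Afanasy's server code: the class SysJobQueue and its methods are hypothetical names, and the exact comparison used for the age limit is an assumption.

```python
from collections import deque


class SysJobQueue:
    """Toy model of the system job task limits (hypothetical, not Afanasy code)."""

    def __init__(self, tasks_max=1000, task_life=1800):
        self.tasks_max = tasks_max  # mirrors af_sysjob_tasksmax
        self.task_life = task_life  # mirrors af_sysjob_tasklife, in seconds
        self.tasks = []             # (command, creation_time) pairs
        self.pending = deque()      # commands waiting for a free task slot

    def add_command(self, command, now):
        # Commands are never dropped: they wait in a list while the job is full.
        self.pending.append(command)
        self._fill(now)

    def task_done(self, command, now):
        # Remove finished task(s) with this command and promote waiting commands.
        self.tasks = [t for t in self.tasks if t[0] != command]
        self._fill(now)

    def expired(self, now):
        # Tasks that reached the maximum age (assumed ">=") count as errors.
        return [cmd for cmd, created in self.tasks
                if now - created >= self.task_life]

    def _fill(self, now):
        # Turn waiting commands into tasks while there is room.
        while self.pending and len(self.tasks) < self.tasks_max:
            self.tasks.append((self.pending.popleft(), now))
```

With tasks_max=2, a third command stays in the pending list until one of the first two tasks finishes, which is the behaviour the parameter description refers to.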
Post Commands¶
Post commands are executed on job deletion. They are designed to clean up temporary files that are not needed without the job. In the most common case, this is a temporary scene file to render.
Most submission scripts copy (save) the current scene to a temporary file. This way the artist can continue to modify and save the currently opened scene during the render. The scene is rendered in the state in which it was submitted.
Post commands are executed by renders via the post_commands block of the server system job.
Wake-On-LAN¶
You can set up Afanasy to put machines to sleep and wake them with Wake-On-LAN.
Wake-On-LAN workflow:
- Server sends a message to the client asking it to sleep.
- Client receives the sleep message from the server.
- Client executes a wolsleep command, which can be customized in Afanasy configuration.
- Client falls asleep.
- Server stops receiving updates from the client and marks it offline.
- Server “decides” to wake a render up.
- Server adds a task wolwake mac1 .. macN to the system job wake-on-lan block. The command can be customized in Afanasy configuration.
- Another online and ready render executes the task.
- This task sends a magic packet to a broadcast address for each MAC address of the sleeping render. It is a small Python script provided with CGRU.
- Render wakes up.
You can wake renders and put them to sleep from the afwatch GUI and with the afcmd command.
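For illustration, a magic packet can be built and broadcast as follows. This is a minimal sketch of the standard Wake-On-LAN packet format, not the script shipped with CGRU; the function names, the broadcast address and the UDP port 9 are assumptions.

```python
import socket


def build_magic_packet(mac):
    """Return the 102-byte magic packet for a MAC such as '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("invalid MAC address: %r" % mac)
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Six 0xFF bytes followed by the MAC address repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def wake(macs, broadcast="255.255.255.255", port=9):
    """Send one magic packet per MAC address to the broadcast address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        for mac in macs:
            sock.sendto(build_magic_packet(mac), (broadcast, port))
```

A call such as wake(["00:11:22:33:44:55"]) corresponds to a wolwake task with a single MAC address argument.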
Events¶
Events are generated by the server. When an event happens, job and user data are pushed to the event service as a command in JSON format. If the event is emitted by a render, the render and all its parent pools are written too. The event service Python class reads its command (the JSON data) and can generate any command to execute. So the event task receives data through a command, processes this data, and can construct a real command to execute as the task process.
JOB_DONE¶
A job is done.
JOB_ERROR¶
Some job task produced an error.
JOB_DELETED¶
Job has been deleted.
RENDER_ZOMBIE¶
Render stopped sending updates to the server for zombie_time seconds.
RENDER_SICK¶
Render produced sick_errors_count errors from different users in a row and got SICK state.
RENDER_NO_TASK¶
Render has no task for no_task_event_time seconds.
RENDER_OVERLOAD¶
Render has low free memory, disk, or swap. What counts as low is configured by these JSON config parameters:
- af_render_overflow_mem: percentage of free memory.
- af_render_overflow_swap: percentage of free swap.
- af_render_overflow_hdd: percentage of free disk space.
By default these parameters are equal to -1, which means that the resource check is disabled.
In practice, a good free percentage at which to emit the event is 1, as an overloaded machine never reaches zero free memory or disk space.
The next event will be emitted after overload_event_time seconds.
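The threshold logic can be sketched as below. This only illustrates the rule that -1 disables a check; the function names are hypothetical, and whether Afanasy compares with "less than" or "less than or equal" is an assumption.

```python
def is_overloaded(free_percent, threshold):
    """True if free_percent is at or below threshold; a negative threshold disables the check."""
    if threshold < 0:
        return False
    return free_percent <= threshold  # "<=" is an assumption


def overloaded_resources(free, config):
    """List the resources whose free percentage is considered low."""
    params = {
        "mem": "af_render_overflow_mem",
        "swap": "af_render_overflow_swap",
        "hdd": "af_render_overflow_hdd",
    }
    # A missing parameter falls back to the documented default of -1 (disabled).
    return [res for res, key in params.items()
            if is_overloaded(free[res], config.get(key, -1))]
```

For example, with only af_render_overflow_mem and af_render_overflow_hdd set to 1, a machine with no free swap is not reported, because the swap check keeps its default of -1.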
There is already a default Python service class:
cgru/afanasy/python/services/events.py
It is designed to send emails.
Example of custom data to send emails:
{
    "emails": ["some@email.com"],
    "events": {
        "JOB_ERROR": {"methods": ["email"]},
        "JOB_DONE": {"methods": ["email"]}
    }
}
User and job custom data objects are simply merged, so a user can hold the email information and a job the events. If a user has both emails and events in custom data, all of that user's jobs will send emails.
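The merge can be pictured with a short sketch. It is not the code of events.py: the function names are hypothetical, and it assumes that custom data arrives as JSON strings and that the merge is shallow with job keys overriding user keys.

```python
import json


def merge_custom_data(user_data, job_data):
    """Shallow merge of two custom data objects; job keys win (an assumption)."""
    merged = dict(user_data)
    merged.update(job_data)
    return merged


def email_recipients(event, user_custom, job_custom):
    """Return the addresses to notify for an event, or an empty list."""
    data = merge_custom_data(json.loads(user_custom or "{}"),
                             json.loads(job_custom or "{}"))
    methods = data.get("events", {}).get(event, {}).get("methods", [])
    return data.get("emails", []) if "email" in methods else []
```

Here the user custom data can carry only "emails" and the job custom data only "events", and the merged object still yields recipients for the configured events.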
You can write any custom Python service class, for example:
cgru/afanasy/python/services/events_local.py
And set it as System job events block service name in your configuration file:
"af_sysjob_events_service":"events_local"
Statistics¶
The Afanasy server can store job and task statistics in an SQL database. It uses the PostgreSQL engine. On job deletion and on task finish (with any result), the server inserts some job and task data into the database tables.
Database Schema¶
afanasy=# \d jobs;
Table "public.jobs"
Column | Type | Collation | Nullable | Default
----------------+------------------------+-----------+----------+---------
annotation | character varying(512) | | |
blockname | character varying(512) | | |
capacity | integer | | | 0
description | character varying(512) | | |
folder | character varying(512) | | |
jobname | character varying(512) | | |
hostname | character varying(512) | | |
service | character varying(512) | | |
tasks_done | integer | | | 0
tasks_quantity | integer | | | 0
run_time_sum | bigint | | | 0
time_done | bigint | | | 0
time_started | bigint | | | 0
username | character varying(512) | | |
serial | bigint | | | 0
id_block | integer | | | 0
afanasy=# \d tasks;
Table "public.tasks"
Column | Type | Collation | Nullable | Default
---------------+-------------------------+-----------+----------+---------
annotation | character varying(512) | | |
blockname | character varying(512) | | |
capacity | integer | | | 0
command | character varying(4096) | | |
description | character varying(512) | | |
error | integer | | | 0
errors_count | integer | | | 0
folder | character varying(512) | | |
frame_pertask | bigint | | | 0
hostname | character varying(512) | | |
jobname | character varying(512) | | |
resources | character varying(4096) | | |
service | character varying(512) | | |
starts_count | integer | | | 0
time_done | bigint | | | 0
time_started | bigint | | | 0
username | character varying(512) | | |
serial | bigint | | | 0
id_block | integer | | | 0
id_task | integer | | | 0
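As an illustration of what can be computed from the tasks table, the sketch below runs an aggregate query per service. It uses an in-memory sqlite3 database with a few made-up rows only so that it runs without a PostgreSQL server; the same SELECT should work in psql. It assumes that time_started and time_done are timestamps in seconds and that error is a 0/1 flag, which the schema alone does not confirm.

```python
import sqlite3

# Stand-in database with a subset of the "tasks" columns (sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (service TEXT, hostname TEXT, error INTEGER,"
    " time_started INTEGER, time_done INTEGER)"
)
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?, ?, ?)",
    [
        ("blender", "host1", 0, 1000, 1060),
        ("blender", "host2", 1, 1000, 1020),
        ("nuke", "host1", 0, 2000, 2030),
    ],
)

# Task count, error count and average run time per service.
QUERY = """
SELECT service,
       COUNT(*) AS tasks,
       SUM(error) AS errors,
       AVG(time_done - time_started) AS avg_seconds
FROM tasks
GROUP BY service
ORDER BY service
"""
stats = conn.execute(QUERY).fetchall()
for service, tasks, errors, avg_seconds in stats:
    print(service, tasks, errors, avg_seconds)
```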
Database Setup¶
Edit the PostgreSQL client authentication configuration file pg_hba.conf.
Its location depends on the Linux distribution. For example:
Debian, Ubuntu:
/etc/postgresql/[version]/main/pg_hba.conf
CentOS, Fedora, openSUSE:
/var/lib/pgsql/data/pg_hba.conf
make install:
/usr/local/pgsql/data/pg_hba.conf
Add this line:
local afanasy afadmin password
Read the comments in this file to learn what it means. (If you have problems with authentication, try trust for all methods.)
Restart the database.
Create the afanasy database and user:
sudo su - postgres
createdb afanasy
psql afanasy
CREATE USER afadmin PASSWORD 'AfPassword';
Create Tables¶
- Go into CGRU root folder:
cd /opt/cgru
- Source setup:
source ./setup.sh
- Check database connection:
afcmd db_check
- The program should print an error, or “Database connection is working” if everything is OK.
- Create required tables:
afcmd db_reset_all
- This command also deletes old tables if they exist.
Server setup¶
You need to install a web server with PHP and PGSQL modules. Any Linux distribution has these packages.
On most Linux distributions this is provided by the packages:
apache2 libapache2-mod-php php php-pgsql
The site is located in the cgru/afanasy/statistics folder.
TIME-WAIT¶
TIME-WAIT is a special socket state needed to ensure that no packets are lost. If the server calls the close() function first, its socket falls into this state. To ensure that the last packet of the connection is processed, it waits:
TIME-WAIT = 2 * MSL (Maximum Segment Lifetime)
This is why the server should not call close() first. With a large number of clients (~1000), the application can reach the 2^16 port limit. Afanasy waits about 2 seconds for the client to close the socket first. To check whether a socket is still connected, we just try to write to it. SIGPIPE is ignored by Afanasy.
To check sockets state you can:
netstat -nat | grep 51000 | wc -l
netstat -nat | egrep ':51000.*:.*TIME_WAIT' | wc -l
ss -tan state time-wait | wc -l
ss -tan 'sport = :51000' | awk '{print $(NF)" "$(NF-1)}' | sed 's/:[^ ]*//g' | sort | uniq -c
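The idea of letting the client close first can be sketched as follows. This is not Afanasy's implementation: the function name is hypothetical, and it detects the peer's close by reading until end-of-stream, which is a different technique from the write test mentioned above. The 2 second default mirrors the wait described in this section.

```python
import socket
import time


def close_after_peer(sock, timeout=2.0):
    """Wait up to `timeout` seconds for the peer to close first, then close.

    Returns True if the peer closed (or reset) the connection before the timeout.
    """
    deadline = time.monotonic() + timeout
    peer_closed = False
    try:
        while not peer_closed:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            # recv() returns b"" once the peer has closed its side.
            if sock.recv(4096) == b"":
                peer_closed = True
    except socket.timeout:
        pass  # peer did not close in time; we close first
    except OSError:
        peer_closed = True  # for example, connection reset by the peer
    finally:
        sock.close()
    return peer_closed
```

When the function returns True, the peer performed the active close, so the TIME-WAIT state is expected on the peer's side rather than on the side that called this function.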