Commit Graph

132 Commits

Author SHA1 Message Date
Sean Sube ccf8d51e08
feat(api): split up status endpoint by job status 2023-03-26 11:57:00 -05:00
Sean Sube ea36082e43
add job count to healthy worker logs 2023-03-26 11:53:06 -05:00
Sean Sube 8eab92a7df
define device on pending job 2023-03-26 11:49:58 -05:00
Sean Sube 83884bcafa
enqueue jobs on idle workers during progress check 2023-03-26 11:48:27 -05:00
Sean Sube 14ade83937
fix(api): enqueue next job when previous one finishes and after recycling worker 2023-03-26 11:41:45 -05:00
Sean Sube f3ab25f671
lint(api): add start method to worker pool 2023-03-26 11:30:07 -05:00
Sean Sube 2b179bebac
fix(api): always reset job counter when creating new device worker 2023-03-26 11:22:03 -05:00
Sean Sube 55e44e8ac9
fix(api): increment job counter for worker when it starts a new job (#283) 2023-03-26 11:18:27 -05:00
Sean Sube e552a5560f
feat(api): check device worker pool and recycle on a regular interval (#284) 2023-03-26 11:09:22 -05:00
Sean Sube aeb71ad50a
lint lock name 2023-03-26 08:30:34 -05:00
Sean Sube 95a61f3573
fix(api): restart worker threads when their respective queues are full 2023-03-25 13:46:12 -05:00
Sean Sube 88f4713e23
fix(api): use lock when restarting workers 2023-03-25 09:47:51 -05:00
Sean Sube 6b4c046867
pass pool to threads 2023-03-22 22:58:46 -05:00
Sean Sube 86c1b29c31
lint(api): extract worker thread main functions (#279) 2023-03-22 22:55:34 -05:00
Sean Sube 4dd68ea6b6
fix(api): restart worker threads if they crash 2023-03-22 19:58:46 -05:00
Sean Sube 0732058aa8
feat(api): detect Textual Inversion type from keys (#262) 2023-03-19 20:16:52 -05:00
Sean Sube aefa5b4613
fix(api): clear job cancelled flag when worker starts a new job (#269) 2023-03-19 17:57:14 -05:00
Sean Sube 2e89fd43d3
fix(api): only remove running jobs from running state 2023-03-18 19:21:40 -05:00
Sean Sube e5862d178c
fix(api): assume inversion tokens are embeddings for now 2023-03-18 18:35:11 -05:00
Sean Sube 1d52dc684d
init last progress on worker context 2023-03-18 17:27:41 -05:00
Sean Sube e08a9aa2ab
add pending job list to pool 2023-03-18 17:26:28 -05:00
Sean Sube 8cbdad3a71
feat(api): add pending field to image ready response 2023-03-18 17:25:13 -05:00
Sean Sube 15b6e036e1
fix(api): maintain list of pending jobs 2023-03-18 17:15:18 -05:00
Sean Sube 588c8c7fdb
fix(api): track last progress within worker 2023-03-18 15:32:49 -05:00
Sean Sube 5106dd48a9
remove another ref to finished queue 2023-03-18 15:27:07 -05:00
Sean Sube b026566ccb
remove remaining references to finished queue and worker 2023-03-18 15:26:19 -05:00
Sean Sube d1565b056e
apply lint, make missing images an error 2023-03-18 15:16:41 -05:00
Sean Sube 7cf5554bef
feat(api): add error flag to image ready response 2023-03-18 15:13:42 -05:00
Sean Sube aec540a524
feat(api): add server setting for CUDA memory limit (#211) 2023-03-18 13:40:37 -05:00
Sean Sube 226710a015
fix(api): use exception level logs 2023-03-16 22:29:07 -05:00
Sean Sube 4b832f3d8d
more lint, more trace 2023-03-16 20:22:20 -05:00
Sean Sube c8c5e9f42e
fix(api): handle more out-of-memory errors in the workers 2023-03-16 18:34:28 -05:00
Sean Sube b2eb406197
fix(api): handle CUDA memory errors in workers 2023-03-15 08:51:29 -05:00
Sean Sube 9555a7a3ea
lint(api): only log new worker message if some workers need to be restarted 2023-03-11 13:30:54 -06:00
Sean Sube 66c42485cb
feat(api): add support for extremely long prompts 2023-03-07 19:00:25 -06:00
Sean Sube 9d9bd1a639
apply lint 2023-03-07 08:02:53 -06:00
Sean Sube 35dc8a0bc4
improve exit logging 2023-03-05 21:37:39 -06:00
Sean Sube c0a01efef4
fix(api): track currently active worker for each device 2023-03-05 21:28:21 -06:00
Sean Sube 57fed94337
fix(api): exit worker on memory allocation errors 2023-03-05 21:11:33 -06:00
Sean Sube cb460a0c59
fix(api): add worker PID to log messages 2023-03-05 20:25:02 -06:00
Sean Sube 4ae4ce176c
fix(api): attempt to recycle leaking workers when a job finishes 2023-03-05 20:13:28 -06:00
Sean Sube 3a4928e59b
fix(api): prevent workers from blocking on their progress queues 2023-03-05 20:07:06 -06:00
Sean Sube edc55ae8b4
fix(api): finished job notification should not block worker 2023-03-05 19:53:44 -06:00
Sean Sube cfc20d3133
fix(api): improve cache logging 2023-03-05 19:30:52 -06:00
Sean Sube 39b9741b24
fix(api): show VRAM percent in logs 2023-03-05 19:23:23 -06:00
Sean Sube 7a3a81a4ef
fix(api): track and repeatedly attempt to recycle leaking workers (#219) 2023-03-05 18:58:13 -06:00
Sean Sube 1f3a5f6f3c
fix(api): track completed jobs for each device worker (#170) 2023-03-01 19:09:18 -06:00
Sean Sube 21fc7c5968
fix(api): mark all convert methods as no_grad 2023-03-01 08:26:40 -06:00
Sean Sube c99aa67220
name threads, max queues, type/lint fixes 2023-02-28 21:44:52 -06:00
Sean Sube c95ac1fbdd
avoid terminating workers because it breaks their queues 2023-02-28 08:53:17 -06:00
Sean Sube 0011f079d4
daemonize queue collectors 2023-02-28 06:55:15 -06:00
Sean Sube cad0d37604
some pending queue logging 2023-02-27 23:43:38 -06:00
Sean Sube 4ae3d9caa2
remove task done 2023-02-27 23:18:37 -06:00
Sean Sube 7e0ccdb1af
remove pending queues after joining 2023-02-27 23:14:20 -06:00
Sean Sube 1ce98ace33
add value error handling 2023-02-27 23:12:53 -06:00
Sean Sube da6ae5d62f
more logging around shutdown, close queues 2023-02-27 23:01:26 -06:00
Sean Sube 988088d64e
quit workers on keyboard signal 2023-02-27 22:52:43 -06:00
Sean Sube 953e5abd36
handle empty errors 2023-02-27 22:45:29 -06:00
Sean Sube 136759285d
set queue timeouts 2023-02-27 22:37:43 -06:00
Sean Sube 0793b61c3a
consistently pass job key to workers 2023-02-27 22:25:53 -06:00
Sean Sube 06f06f5a11
error handling in all threads 2023-02-27 19:48:51 -06:00
Sean Sube 113ad05293
typo 2023-02-27 17:36:26 -06:00
Sean Sube 2327b24022
join all threads 2023-02-27 17:35:31 -06:00
Sean Sube 66a20e60fe
run logger in a thread, clean up status 2023-02-27 17:14:53 -06:00
Sean Sube 13395933dc
always put progress in active jobs 2023-02-26 20:41:16 -06:00
Sean Sube a37d1a4550
use progress queue 2023-02-26 20:37:22 -06:00
Sean Sube 401ee20526
fix finished flag 2023-02-26 20:13:16 -06:00
Sean Sube 525ee24e91
track started and finished jobs 2023-02-26 20:09:42 -06:00
Sean Sube eb82e73e59
initialize list of finished jobs 2023-02-26 15:26:54 -06:00
Sean Sube b931da1d2c
fix imports, lint 2023-02-26 15:21:58 -06:00
Sean Sube 85118d17c6
clear worker flags between jobs, attempt to record finished jobs again 2023-02-26 15:06:40 -06:00
Sean Sube d1961afdbc
re-implement cancellation 2023-02-26 14:36:32 -06:00
Sean Sube 584dddb5d6
lint all the new stuff 2023-02-26 14:15:30 -06:00
Sean Sube b880b7a121
set process titles, terminate workers 2023-02-26 13:09:24 -06:00
Sean Sube 6502e1e3c8
recycle worker pool after 10 jobs 2023-02-26 12:58:38 -06:00
Sean Sube e0737e9e08
update progress and finished flag from worker 2023-02-26 12:51:11 -06:00
Sean Sube f115326da7
apply patches within workers 2023-02-26 12:32:48 -06:00
Sean Sube e1d0ad54b7
lock per worker, torch before ORT 2023-02-26 12:24:51 -06:00
Sean Sube d765a6f01b
make logger start up well 2023-02-26 11:16:33 -06:00
Sean Sube 6998e8735c
rejoin worker pool 2023-02-26 10:47:31 -06:00
Sean Sube 943281feb5
wire up worker jobs 2023-02-25 23:55:30 -06:00
Sean Sube f898de8c54
background workers, logger 2023-02-25 23:49:39 -06:00