Now tracks states by passing around an empty struct as a signal to
states defined on the service, as opposed to signal state changes by
closing the channel. This makes sure services can only be in one
state at once and allows for multiple state changes which were not
possible before.
- add new metrics (off by default)
- add Debug boolean option for full metric details
- add Timezone option for 'running_since' field
- update default metrics shown and their constants
- remove utils.GetStartTime(), now use process metrics
CPU profiling changes:
cgr-engine.go:
- use filepath.Join instead of path.Join
- handle *CoreService.StopCPUProfiling error inside deferred function
- same with the error from *os.File.Close()
cores/core.go:
- StartCPUProfile now returns an *os.File (as opposed to an io.WriteCloser),
because os.File.Stat is used beforehand to check if a handler of the file is
already active and confirm the status of profiling. Asserting the type would
have worked as well.
- handle pprof.StartCPUProfile error and ensure file is closed before returning
- log file close error as a warning if it occurs
- return missing mandatory error with correct path field name ('DirPath')
- no need to check if fileCPU is nil for profiling status
- pprof.StartCPUProfiling will return an error if profiling is already started
- os.File.Close() will return ErrClosed if profiling is already stopped
- differentiate between calling StopCPUProfiling when profiling hasn't started
and when it was already stopped by returning appropriate errors
Memory profiling changes:
- merge StopChanMemProf with StopMemoryProfiling
- remove fileMEM and stopMemProf from struct and constructors
- add separate mutex for memory profiling, ensure thread safety
- handle all significant errors
- log error if StopMemoryProfiling fails during CoreS Shutdown
- ignore errors if profiling inactive in Shutdown and deferred Stop
- move validations inside V1 functions
- return error if StartMemoryProfiling already started
- return error if StopMemoryProfiling already stopped or never started
- close profiling loop on error, not the cgr-engine
- StopMemoryProfiling closes channel and profiling loop writes final profile
- rename Path to DirPath for mandatory field error
- rename memprof_nrfiles flag to memprof_maxfiles
- increase default memprof_interval
- consider MaxFiles <= 0 as unlimited
- move memory profiling logic after starting services
- use CoreService Start/StopMemoryProfiling in main
- remove final memory profile block (created by deferred Stop)
- convert MemProfiling to method on CoreService and rename to profileMemory
- use Ticker for recurrent actions instead of Timer
- compute mem_final.prof full path in StartMemoryProfiling
- suffix profile files with current time instead of numbers
- update dispatcher methods after changes
- move MemoryPrf from utils to cores, rename to MemoryProfilingParams
- add logs for starting/stopping profiling
- added the possibility to disable timestamps in the memory profile file names
and use increments of 1 instead.
Other changes:
- improved integration tests for flags (now table tests)
- improved profiling integration tests
Set redisPoolPipelineWindow to control duration before pipeline
flush (0 disables implicit pipelining) and redisPoolPipelineLimit
for max commands per pipeline (0 means no limit, only time window
applies).
- Updated to use the test suite
- Deleted kafka_ssl sample configuration (moved to test file)
- Revised Kafka server SSL setup comment
- Ensured exporters are synchronous to avoid missing errors
- Implemented helper function to create and clean up kafka topics
- ERs sometimes took too long to receive a message.
Setting kafkaGroupID to "" prevents this.
- Kafka reader took 10s to close (default MaxWait).
Set MaxWait to 1ms.
- Exporters took 1s each to export due to BatchSize
not being hit. Set BatchSize to 1 to prevent it.
Avoids the default 1 second delay when the batch doesn't
reach 100 messages within that time.
Useful when the Kafka exporter is not cached, as it would
otherwise encounter that delay. Setting BatchSize to 1
prevents this.
- renamed parameter type: ArgsReplyFailedPosts -> ReplayEventsParams
- renamed param fields:
- FailedRequestsInDir -> SourcePath
- FailedRequestsOutDir -> FailedPath
- TypeProvider -> Provider
- changed param fields types from *string to string
- used the SourcePath and FailedPath params directly instead of creating separate variables
- used filepath.WalkDir instead of reading the directory and looping over the entries
- used slices.ContainsFunc to check if the file belongs to any module (if 1+ is specified)
- used filepath.Join instead of path.Join
- used the path provided by WalkFunc instead of building the file paths ourselves
- made error returns more descriptive
- added logs for directories/files that are skipped
- paths that cannot be accessed are skipped after logging the error
- made use of the test setup helpers.
- used t.Cleanup instead of defer.
- instead of waiting 50ms for the nats-server to start, used a helper
hook to attempt connections in fibonacci intervals. On success it
keeps a reference to the connection for later usage.
- handle error for stream delete function executed during cleanup.
- shorten time.Sleep durations when waiting for exports to finish.
- extract the cache itemID checking logic into a separate func
- retry failed requests in fibonacci intervals for up to 500ms
Applies to both file readers and loader (for loader, the blank statement
was used anyway).
It's redundant because for file readers, the rdr.sourceDir value was
always passed as the parameter when it was already part of the method's
object.
Parameter had to also be removed from the WatchDir function and the
functions it depends on.