pablocastro great article and sample code. I'm wondering how to get around the prompt size limitation. I know the context size will increase significantly with GPT-4, but it's still limited; if an enterprise has 10 years' worth of data in some storage, it's not feasible to feed all of it to the model in the prompt. The ideas you pointed out could help, but they still might not represent the data in its entirety.
As an alternative, if absolute real-time freshness isn't necessary, I was wondering if scheduled training of the model would work. For example, export the enterprise data every night in the format used for training and retrain the model. There would be a one-day lag in the model's knowledge, but it would know basically everything we want it to.
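To make the nightly-export idea concrete, here's a rough sketch of what I'm imagining the export step could look like. Everything here is assumed for illustration: the record fields (`created`, `question`, `answer`) and the prompt/completion JSONL shape are just placeholders for whatever format the actual fine-tuning pipeline expects.

```python
import json
from datetime import date, timedelta

def export_for_training(records, out_path):
    """Write yesterday's enterprise records to a JSONL file of
    prompt/completion pairs, a format commonly used for fine-tuning.
    Record field names are illustrative, not from any real schema.
    Returns the number of examples written."""
    yesterday = date.today() - timedelta(days=1)
    count = 0
    with open(out_path, "w", encoding="utf-8") as f:
        for rec in records:
            # Skip records older than the last nightly run; a first
            # full run would export the entire 10-year history instead.
            if rec["created"] < yesterday:
                continue
            example = {
                "prompt": f"Q: {rec['question']}\nA:",
                "completion": " " + rec["answer"],
            }
            f.write(json.dumps(example) + "\n")
            count += 1
    return count
```

A cron job could run this each night and hand the resulting file to whatever training or fine-tuning API is available, which is where my question about incremental training comes in.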
Thoughts on this approach? If you think it's feasible, is there a way to train the model incrementally, e.g. train once on the past 10 years of data, then each day add only the previous day's data on top of what the model already knows?