Changes between Version 6 and Version 7 of drmsGeneralAndHMIissue


Timestamp:
04/12/21 16:36:18 (3 years ago)
Author:
niles
Comment:

--

  • drmsGeneralAndHMIissue

    v6  v7
    56  56  There are several mechanisms in operation. First, if a user requests HMI data from a remote site, and the remote site happens to have the data locally (i.e. already copied to the remote site, either by a recent mirror request or a recent previous user request), then the downloads likely succeed (hence the top left points in the scatter plot).
    57  57
    58      If the data need to be copied to the remote site to answer the user request, then it becomes possible that the user's connection to the remote site will time out prior to the copying of data from the JSOC completing. Worse, if there are many requests for HMI data in a short period, then the limit on the number of copies from the JSOC to the remote site may be reached, and copy requests will start to be queued. If this occurs, then requests for AIA data that happen to be made at the same time that the remote site is busy with many requests for HMI data will also start to time out due to the bottleneck in copy requests. In that event, the requests for HMI data will have caused problems not only for HMI data, but for AIA data as well.
        58  If the data need to be copied to the remote site to answer the user request, then it becomes possible that the user's connection to the remote site will time out prior to the copying of data from the JSOC completing. This is likely happening for HMI data due to the bundled nature of HMI data making the copying of data to the remote site take longer. Worse, if there are many requests for HMI data in a short period, then the limit on the number of copies from the JSOC to the remote site may be reached, and copy requests will start to be queued. If this occurs, then requests for AIA data that happen to be made at the same time that the remote site is busy with many requests for HMI data will also start to time out due to the bottleneck in copy requests. In that event, the requests for HMI data will have caused problems not only for HMI data, but for AIA data as well.
    59  59
    60      There is some evidence that users have become aware that they need to ask for HMI data twice, with the first request failing due to timeout but succeeding at getting the data copied to the remote site so that subsequent requests will succeed. Examination of the apache server logs after one inciden at NSO during which the system was struggling to deliver HMI data seemed to support this, that the same user was coming back to retrieve the data requested on the first attempt. Users may not be aware of the underlying mechanism, but at least some experienced users have become aware that they may need to request data multiple times.
        60  There is some evidence that users have become aware that they need to ask for HMI data twice, with the first request failing due to timeout but succeeding at getting the data copied to the remote site so that subsequent requests will succeed. Examination of the apache server logs after one incident at NSO during which the system was struggling to deliver HMI data seemed to support this, that the same user was coming back to retrieve the data requested on the first attempt. Users may not be aware of the underlying mechanism, but at least some experienced users have become aware that they may need to request data multiple times.
        61
        62  These mechanisms have been observed in near real time from web pages that monitor the status of downloads from the NSO remote site. Bad (non-200 status) downloads often correlate with requests for HMI data, with AIA data also being affected in more severe cases.
    61  63
    62  64  === 5.0 Possible solution ===
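
The copy-queue bottleneck described in the changed paragraphs above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual remote-site software: the function name `simulate_copy_queue`, the limit of two parallel copies, the 60 second client timeout and the copy durations are all hypothetical values chosen for illustration. As in the text, a copy keeps running after the client has timed out, so a later request for the same data would find it already local.

```python
import heapq


def simulate_copy_queue(requests, max_copies, client_timeout):
    """Toy model of a remote site with a limited number of parallel copies.

    requests: list of (arrival_time, instrument, copy_seconds) tuples.
    Returns a list of (instrument, arrival_time, timed_out) tuples.
    All numbers are hypothetical; this is not the real site configuration.
    """
    # Min-heap holding the time at which each copy slot next becomes free.
    free_at = [0.0] * max_copies
    heapq.heapify(free_at)

    results = []
    for arrival, instrument, copy_seconds in sorted(requests):
        slot_free = heapq.heappop(free_at)
        start = max(arrival, slot_free)   # the request waits if every slot is busy
        finish = start + copy_seconds     # the copy completes even if the client gave up
        heapq.heappush(free_at, finish)
        timed_out = (finish - arrival) > client_timeout
        results.append((instrument, arrival, timed_out))
    return results


# Hypothetical workload: four slow (bundled) HMI copies, then one fast AIA copy.
busy = simulate_copy_queue(
    [(0, "HMI", 50), (1, "HMI", 50), (2, "HMI", 50), (3, "HMI", 50), (4, "AIA", 5)],
    max_copies=2,
    client_timeout=60,
)
# The same AIA request with no competing HMI traffic.
alone = simulate_copy_queue([(4, "AIA", 5)], max_copies=2, client_timeout=60)

print("AIA timed out behind HMI burst:", busy[-1][2])
print("AIA timed out on idle site:", alone[0][2])
```

In this toy workload the AIA copy itself is short, yet it times out because it has to wait behind the queued HMI copies, which is the cross-instrument effect described in the text.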