Computes a Virtual Floristic List (VFL): the taxa potentially occurring within a study site, each with a probability derived from spatial uncertainty and temporal decay and aggregated across records using the inclusion-exclusion principle.

virtual_list(
  data_flor,
  site,
  year_study = NULL,
  excl_areas = NULL,
  CRS.new = 3035,
  tau,
  upperlimit = 20,
  min_probability = 0,
  verbose = TRUE,
  check_overlap = TRUE,
  output_dir = file.path(getwd(), "output"),
  output_prefix = "VFL"
)

Arguments

data_flor

Data frame with columns: 'Taxon', 'Long', 'Lat', 'uncertainty' (radius in meters), 'year'.

site

sf polygon or SpatialPolygonsDataFrame of study area.

year_study

Numeric year of analysis (default = current year).

excl_areas

Optional sf polygon(s) of unsuitable areas to exclude.

CRS.new

Numeric EPSG code for projected CRS (default = 3035).

tau

Percent taxa loss per 100 years (0 ≤ tau < 100).

upperlimit

Maximum number of records per taxon used in probability calculation (default = 20). Prevents computational explosion for well-sampled taxa. Higher values increase accuracy but dramatically slow computation:

  • 10 Very fast, good accuracy - for exploratory analysis

  • 20 Fast, very good accuracy - recommended default

  • 30 Slow, excellent accuracy - for publication-quality results

min_probability

Minimum probability threshold, in percent (default = 0). Set to 5-10 to filter out unlikely taxa.

verbose

Logical; print progress messages (default = TRUE).

check_overlap

Logical; plot spatial overlap diagnostic (default = TRUE).

output_dir

Directory for output files (default = an "output" subfolder of the working directory).

output_prefix

Filename prefix (default = "VFL").

Value

List with:

  • VFL Data frame: Taxon, probability, records, max, min.

  • Statistics Metadata table.

  • Plots Named list of 2 ggplot objects (histograms).

  • spatial_data sf objects for further analysis.

Details

The function uses the inclusion-exclusion principle to aggregate the probabilities from multiple records of the same taxon. For a taxon with n records, the computation involves on the order of 2^n record combinations. The upperlimit parameter caps this exponential growth: when n > upperlimit, only the upperlimit records with the highest probabilities are used.
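
The aggregation step can be illustrated with a minimal sketch. This is not the package's internal code: union_probability() is a hypothetical helper, probabilities are expressed on a 0-1 scale rather than in percent, and records are assumed to be independent so that the joint probability of a subset of records is the product of their individual probabilities.

```r
# Hypothetical helper (not exported by the package): probability that at
# least one record of a taxon falls inside the site.
# Assumption: records are independent, so P(A and B) = P(A) * P(B).
union_probability <- function(p, upperlimit = 20) {
  stopifnot(is.numeric(p), all(p >= 0 & p <= 1))

  # Keep only the `upperlimit` most probable records
  if (length(p) > upperlimit) {
    p <- sort(p, decreasing = TRUE)[seq_len(upperlimit)]
  }

  n <- length(p)
  if (n == 0) return(0)

  total <- 0
  for (k in seq_len(n)) {
    # Each column of `idx` indexes one subset of k records
    idx <- combn(n, k)
    terms <- apply(idx, 2, function(i) prod(p[i]))
    # Inclusion-exclusion: add odd-sized subsets, subtract even-sized ones
    total <- total + (-1)^(k + 1) * sum(terms)
  }
  total
}

union_probability(c(0.5, 0.5))
# 0.75
```

The loop visits every non-empty subset of records, which is why the cost doubles with each additional record. Under the independence assumption the result should match the closed form 1 - prod(1 - p), which makes a convenient check for the sketch.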

PDF Output Structure (2 pages, A4 landscape):

  • Page 1 Distribution analysis (probability + temporal histograms)

  • Page 2 Summary statistics table

Examples

if (FALSE) { # \dontrun{
# Load example data
data(floratus)
data(park)

# Basic usage (creates VFL_output.pdf and VFL_probabilities.csv)
vfl <- virtual_list(
  data_flor = floratus,
  site = park,
  tau = 30
)

# Custom naming (creates Park_output.pdf and Park_probabilities.csv)
vfl <- virtual_list(
  data_flor = floratus,
  site = park,
  tau = 30,
  output_prefix = "Park"
)

# Filter unlikely taxa (>5% probability only)
vfl_filtered <- virtual_list(
  data_flor = floratus,
  site = park,
  tau = 30,
  min_probability = 5
)

# High accuracy mode
vfl_accurate <- virtual_list(
  data_flor = floratus,
  site = park,
  tau = 30,
  upperlimit = 30
)
} # }